M343 APPLICATIONS OF PROBABILITY

Mathematics: A Third Level Course

The Open University

Unit 2

Random processes

Prepared by the Course Team


CONTENTS

Introduction
1 More on probability: related variables
    1.1 Discrete bivariate distributions
    1.2 Continuous bivariate distributions
2 Random processes
    2.1 What is a random process?
    2.2 The Bernoulli process
    2.3 Further examples of random processes
3 The Poisson process
    3.1 Basic ideas and results
    3.2 A more formal approach to the Poisson process
    3.3 The multivariate Poisson process
4 Extensions of the Poisson process
    4.1 The non-homogeneous Poisson process
    4.2 The compound Poisson process
5 Point processes
    5.1 Types of point process
    5.2 The index of dispersion
6 Further examples of random processes
    6.1 The simple birth process
    6.2 Further examples
7 Deterministic models
Objectives
Appendix: Solutions to questions



Statistics  tables 


The  recommended  book  of  statistics  tables  for  this  course  is  H.  R.  Neave, 
Elementary  Statistics  Tables  (Unwin  Hyman,  1981).  In  this  unit,  these  tables  are 
referred  to  as  Neave. 


Unit titles

1 Probability and Random Variables
2 Random Processes
3 Patterns in Space
4 Branching Processes
5 Random Walks
6 Markov Chains
7 Birth Processes
8 Birth and Death Processes
9 Queues
10 Epidemics
11 More Population Models
12 Genetics
13 Renewal Models
14 Diffusion Processes
16 Problems, Problems, Problems

The  Open  University,  Walton  Hall,  Milton  Keynes,  MK7  6AA. 

First  published  1997.  Reprinted  1998,  2001,  2003 
Copyright  ©  1997  The  Open  University 

All  rights  reserved;  no  part  of  this  publication  may  be  reproduced,  stored  in  a  retrieval  system, 
or  transmitted  in  any  form  or  by  any  means,  electronic,  mechanical,  photocopying,  recording,  or 
otherwise  without  either  the  prior  written  permission  of  the  Publishers  or  a  licence  permitting 
restricted  copying  issued  by  the  Copyright  Licensing  Agency,  90  Tottenham  Court  Road, 
London,  W1P  9HE.  This  publication  may  not  be  lent,  resold,  hired  out  or  otherwise  disposed  of 
by  way  of  trade  in  any  form  of  binding  or  cover  other  than  that  in  which  it  is  published,  without 
the  prior  consent  of  the  Publishers. 

Edited, designed and typeset by the Open University using the Open University TeX System.
Printed  in  the  United  Kingdom  by  the  Alden  Group,  Oxford. 

ISBN  0  7492  7852  8 

This  text  forms  part  of  an  Open  University  Third  Level  Course.  If  you  would  like  a  copy  of 
Studying  with  The  Open  University,  please  write  to  the  Central  Enquiry  Service,  PO  Box  200, 
The  Open  University,  Walton  Hall,  Milton  Keynes,  MK7  6YZ.  If  you  have  not  already  enrolled 
on  the  Course  and  would  like  to  buy  this  or  other  Open  University  material,  please  write  to 
Open  University  Educational  Enterprises  Ltd,  12  Cofferidge  Close,  Stony  Stratford,  Milton 
Keynes,  MK11  1BY,  United  Kingdom. 




Introduction 


In  this  unit  the  idea  of  a  random  variable  is  extended,  and  you  will  learn  what  is 
meant  by  the  phrase  random  process.  The  essential  property  of  a  random  process 
is  that  it  consists  of  an  ordered  list  or  sequence  of  observations  on  a  random 
variable,  and  we  are  interested  in  the  probability  distribution  of  items  in  the  list 
and  in  how  the  values  of  early  observations  affect  subsequent  values. 

For  instance,  we  might  be  interested  in  the  size  of  a  closed  population  such  as  an 
ant  colony  or  a  worm  colony,  and  how  the  size  of  the  population  alters  with  the 
passage  of  time.  If  the  colony  is  observed  and  its  population  counted  at  regular 
intervals,  then  we  are  likely  to  notice,  in  the  early  stages  of  development  at  least, 
a  generally  smooth  growth,  but  with  some  variation  in  the  population  size,  caused 
by  chance.  Perhaps  later  in  its  development,  some  kind  of  environmental  stability 
may  be  attained,  and  the  population  size  will  level  out;  but  still  we  would  expect 
to  see  some  continuing  evidence  of  variation  in  the  colony  size,  through  the  chance 
effects  of  birth  and  death. 

If the random variable $X_n$ denotes the size of the population n weeks after the
colony is started, then we may be interested in forecasting the future size of the
colony based on observations $X_0$ (the starting size) and weekly observations
$X_1, X_2, X_3, \ldots, X_{10}$ (say). What do these data tell us about the dynamics of the
population? Can we devise probability models for the sequence of observations?
Can we use them to estimate parameters (the birth rate and the death rate, say)
for the colony? What can we say about the value of $X_{20}$, or $X_{100}$? Can we say
anything about the eventual fate of the colony other than by observing it for as
long as it survives?

The unit begins with some brief further work on random variables and then goes
on, in Sections 2 to 4, to commence our study of random processes, which we
begin in a simple way by looking at sequences of events and their times of
occurrence. Basic models for events occurring randomly in time
are the Bernoulli process and the Poisson process. These basic models are
described at first intuitively and then more formally; more sophisticated
extensions of the Poisson process are then described, in order to provide adequate
models for more complicated random situations.

A  technique  that  may  be  used  to  compare  some  random  processes  with  a  Poisson 
process  is  introduced  in  Section  5;  and  a  few  further  examples  of  random 
processes  are  discussed  briefly  in  Section  6. 

The  final  section  of  the  unit  deals  with  deterministic  models  for  random  processes: 
in  a  deterministic  model,  the  element  of  chance  is  removed  and  what  we  are  left 
with  is  an  average  (or  ‘most  likely’)  representation  of  the  development  of  a  process. 

In  Subsection  3.2,  the  mathematical  derivation  of  a  useful  and  important  result  in 
probability  is  followed  in  detail.  Similar  detailed  expositions  are  to  be  found 
scattered  throughout  the  course  (and  particularly  in  Units  7  and  8).  In  general, 
however,  it  is  the  result  and  its  application  that  are  important;  and,  except  in  rare 
cases,  you  will  not  be  expected  to  reproduce  for  assessment  purposes  the 
mathematical  arguments  leading  to  the  statement  of  general  results. 

The  first  four  sections  of  the  unit  are  the  longest  and  the  most  important. 

Section  2  includes  a  lot  of  general  reading  and  relatively  little  mathematics. 
Section  4  includes  a  video  session.  The  video  band  lasts  only  about  five  minutes, 
but  you  may  wish  to  watch  parts  of  it  more  than  once;  so  allow  up  to  fifteen 
minutes  of  study  time  for  it. 

There are extra exercises on the topics covered in this unit in the Problem Booklet
for Units 1 and 2. You will also find a few questions on the material in this unit
in Unit 16 (namely Questions 1.9-1.13 in the section headed Unit 1 and
Questions 2.1-2.5 in the section headed Unit 2).

There  is  no  audio  component  associated  with  this  unit. 


1  More  on  probability:  related  variables 


We  begin  with  one  more  application  of  probability  theory  that  will  be  useful  for 
the  study  of  random  processes.  Some  of  the  ideas  introduced  in  Unit  1  concerning 
relationships  between  two  or  more  variables  are  brought  together  and  extended. 

In  Subsection  1.1,  the  joint  distribution  of  two  discrete  random  variables  is 
discussed,  the  idea  of  independence  of  such  random  variables  is  explored  and  the 
notion  of  a  conditional  distribution  is  introduced.  Also,  conditional  expectation  is 
defined.  (This  last  idea  is  important  for  simplifying  the  calculation  of  some  means 
and we shall need it in Subsection 4.2.) In Subsection 1.2, these
ideas  are  extended  to  continuous  random  variables.  The  examples  used  have  been 
kept  very  simple  so  that  the  main  ideas  will  not  be  obscured  by  the  need  for 
complicated  calculations. 


1.1  Discrete  bivariate  distributions 

Joint  and  marginal  distributions 

The joint probability function $p(x, y)$ for two discrete random variables X and
Y is defined by

$$p(x, y) = P(X = x, Y = y), \quad x \in \Omega_X,\ y \in \Omega_Y.$$

The  distribution  of  X  alone  may  be  obtained  by  summing  the  joint  probability 
function  over  all  possible  values  of  y  to  give  the  probability  function  of  X. 
Similarly,  the  distribution  of  Y  is  obtained  by  summing  the  joint  probability 
function  over  all  values  of  x.  This  is  illustrated  in  the  following  simple  example. 

Example  1.1 

For  the  last  few  years  Mary  has  kept  in  touch  with  three  friends  by  post.  She  has 
noticed  that  she  never  hears  from  more  than  two  of  them  in  any  week;  frequently 
she  receives  only  one  letter  and  sometimes  none  at  all.  She  doesn’t  write  to  them 
every  week,  but  when  she  does,  she  writes  to  all  three  of  them.  If  X  is  the  number 
of  letters  she  writes  and  Y  is  the  number  she  receives  in  a  week,  then  Table  1.1 
gives  the  joint  probability  function  of  X  and  Y. 

Table 1.1 The joint probability function of X and Y

                 y
            0     1     2
   x   0   0.1   0.4   0.1
       3   0     0.2   0.2

The entries in Table 1.1 are values of the joint probability function $p(x, y)$ of X
and Y. The probability function of X is found by summing across the rows: for
example,

$$P(X = 0) = P(X = 0, Y = 0) + P(X = 0, Y = 1) + P(X = 0, Y = 2),$$

and so

$$p_X(0) = p(0, 0) + p(0, 1) + p(0, 2) = 0.1 + 0.4 + 0.1 = 0.6.$$

Similarly,

$$p_X(3) = p(3, 0) + p(3, 1) + p(3, 2) = 0 + 0.2 + 0.2 = 0.4.$$

So, for each $x \in \Omega_X$, we have

$$p_X(x) = \sum_{y \in \Omega_Y} p(x, y),$$

and  X  has  the  following  probability  function. 


   x          0     3
   p_X(x)    0.6   0.4

It  follows  that  Mary  writes  no  letters  0.6  of  the  time,  that  is  in  three  weeks  out  of 
five,  and  writes  to  all  three  friends  in  two  weeks  out  of  five. 

In  a  similar  way,  the  probability  function  of  Y  is  obtained  by  summing  down  the 
columns: 

$$p_Y(0) = p(0, 0) + p(3, 0) = 0.1;$$
$$p_Y(1) = p(0, 1) + p(3, 1) = 0.6;$$
$$p_Y(2) = p(0, 2) + p(3, 2) = 0.3.$$

That is, for each $y \in \Omega_Y$,

$$p_Y(y) = \sum_{x \in \Omega_X} p(x, y),$$

and Y has the following probability function.

   y          0     1     2
   p_Y(y)    0.1   0.6   0.3

This  is  the  distribution  of  the  number  of  letters  Mary  receives  in  a  week. 

The probabilities $p_X(x)$ and $p_Y(y)$ may conveniently be written in the margins of
the table for $p(x, y)$, as in Table 1.2.

Table 1.2

                    y
               0     1     2     p_X(x)
   x      0   0.1   0.4   0.1    0.6
          3   0     0.2   0.2    0.4
   p_Y(y)     0.1   0.6   0.3

The  distributions  of  X  alone  and  of  Y  alone,  when  obtained  from  the  joint 
distribution  of  X  and  Y,  are  often  called  the  marginal  distributions  of  X  and 
Y; the word marginal is used because these distributions are given by the entries
in  the  margins  of  the  table  of  joint  probabilities.  (Notice  that  the  sum  of  the 
entries  in  each  marginal  distribution  is  1.)  The  rule  for  finding  the  marginal 
distributions  from  the  joint  distribution  is  summarized  in  the  box  below. 


If the discrete random variables X and Y have joint probability function
$p(x, y)$, $x \in \Omega_X$, $y \in \Omega_Y$, then the marginal distribution of X is given by
the probability function

$$p_X(x) = \sum_{y \in \Omega_Y} p(x, y), \quad x \in \Omega_X,$$

and the marginal distribution of Y is given by

$$p_Y(y) = \sum_{x \in \Omega_X} p(x, y), \quad y \in \Omega_Y.$$
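The rule in the box can be checked numerically. The following is a minimal Python sketch that applies it to the joint probability function of Table 1.1; the dictionary layout and the function name `marginals` are illustrative choices, not part of the course text.

```python
# Joint probability function of Table 1.1, keyed by (x, y).
joint = {
    (0, 0): 0.1, (0, 1): 0.4, (0, 2): 0.1,
    (3, 0): 0.0, (3, 1): 0.2, (3, 2): 0.2,
}


def marginals(joint):
    """Return (p_X, p_Y) as dictionaries, obtained by summing the joint table."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p  # sum over y for each fixed x
        p_y[y] = p_y.get(y, 0.0) + p  # sum over x for each fixed y
    return p_x, p_y


p_x, p_y = marginals(joint)
# Rounding only hides floating-point noise in the printed sums.
print({x: round(p, 10) for x, p in p_x.items()})
print({y: round(p, 10) for y, p in p_y.items()})
```

Up to floating-point rounding, the two dictionaries should agree with the margins of Table 1.2, and each should sum to 1.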




Question 1.1 The random variables X and Y have the joint probability
function given in Table 1.3.

Table 1.3 A joint probability function

                 y
            0     1     2
   x   1   0.1   0.2   0.2
       2   0.1   0.3   0.1

Find the marginal distributions of X and Y. □

Conditional  distributions 

You  have  seen  how  the  joint  distribution  of  two  discrete  random  variables  can  be 
represented  by  a  joint  probability  function  and  how  to  obtain  the  marginal 
distributions  of  the  two  random  variables  from  the  joint  probability  function. 

Now  suppose  that  the  value  of  one  random  variable  is  observed.  What  does  this 
tell  us  about  the  value  of  the  other? 

Example  1.2 

In  Example  1.1,  if  it  is  known  that  Mary  wrote  three  letters  one  week,  what  is  the 
probability  that  she  received  one  letter  that  week? 

To answer this, we must find the conditional probability $P(Y = 1 \mid X = 3)$. By the
definition of conditional probability (see Formula (1.4) in Unit 1),

$$P(Y = 1 \mid X = 3) = \frac{P([Y = 1] \cap [X = 3])}{P(X = 3)} = \frac{p(3, 1)}{p_X(3)} = \frac{0.2}{0.4} = 0.5.$$

Similarly,

$$P(Y = 0 \mid X = 3) = \frac{p(3, 0)}{p_X(3)} = 0,$$

$$P(Y = 2 \mid X = 3) = \frac{p(3, 2)}{p_X(3)} = \frac{0.2}{0.4} = 0.5.$$

So the distribution of the number of letters Mary receives (Y) given that she
writes three (X = 3) is as follows.

   y                    0     1     2
   P(Y = y | X = 3)     0    0.5   0.5

The notation $p(y \mid 3)$ is used for this probability function. The distribution is called
the conditional distribution of Y given X = 3. □

In general, the notation $p(y \mid x)$ is used for the probability function of the
conditional distribution of Y given X = x. Note that this distribution is a
univariate distribution of the random variable Y for a fixed value x of X.

Question 1.2 Find the conditional distribution of the number of letters Mary
receives in a week given that she writes no letters that week; that is, find the
probability function $p(y \mid 0)$. □

In exactly the same way we can define the probability function of the conditional
distribution of X given Y = y. This is denoted by $p(x \mid y)$ and, provided
$P(Y = y) \neq 0$,

$$p(x \mid y) = \frac{P([X = x] \cap [Y = y])}{P(Y = y)} = \frac{p(x, y)}{p_Y(y)}.$$

Question 1.3 Find the conditional distribution of the number of letters Mary
writes in a week given that she receives one letter; that is, find the probability
function $p(x \mid 1)$. □


The  conditional  distributions  described  above  are  defined  formally  in  the  box 
below. 


If X, Y are discrete random variables whose joint probability function is
$p(x, y)$ and whose marginal distributions have probability functions $p_X(x)$,
$p_Y(y)$ respectively, then the conditional distributions of Y given X = x
and of X given Y = y have probability functions denoted by $p(y \mid x)$, $p(x \mid y)$
respectively and

$$p(y \mid x) = \frac{p(x, y)}{p_X(x)} \quad \text{if } p_X(x) > 0, \qquad (1.1)$$

$$p(x \mid y) = \frac{p(x, y)}{p_Y(y)} \quad \text{if } p_Y(y) > 0. \qquad (1.2)$$
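Definitions (1.1) and (1.2) translate directly into a short computation. The Python sketch below applies them to the joint probability function of Table 1.1 and reproduces the conditional distribution of Y given X = 3 found in Example 1.2; the function names are illustrative assumptions, not notation from the course.

```python
# Joint probability function of Table 1.1, keyed by (x, y).
joint = {
    (0, 0): 0.1, (0, 1): 0.4, (0, 2): 0.1,
    (3, 0): 0.0, (3, 1): 0.2, (3, 2): 0.2,
}


def conditional_y_given_x(joint, x):
    """p(y|x) = p(x, y) / p_X(x), defined only when p_X(x) > 0 (Definition 1.1)."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    if p_x <= 0:
        raise ValueError("p_X(x) must be positive")
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}


def conditional_x_given_y(joint, y):
    """p(x|y) = p(x, y) / p_Y(y), defined only when p_Y(y) > 0 (Definition 1.2)."""
    p_y = sum(p for (_, yi), p in joint.items() if yi == y)
    if p_y <= 0:
        raise ValueError("p_Y(y) must be positive")
    return {x: p / p_y for (x, yi), p in joint.items() if yi == y}


cond = conditional_y_given_x(joint, 3)
print(cond)  # conditional distribution of Y given X = 3
```

Each conditional distribution is itself a probability function, so its values should sum to 1 for any x (or y) with positive marginal probability.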


Independence 

In Subsection 2.1 of Unit 1, two random variables X and Y were defined to be
independent if the occurrence of any event associated with one of them, X say,
does not depend on the occurrence of any event associated with the other. It
follows that if X and Y are independent discrete random variables then, for all
$y \in \Omega_Y$ and for all $x \in \Omega_X$,

$$P(X = x \mid Y = y) = P(X = x);$$

that is,

$$p(x \mid y) = p_X(x), \quad \text{for all } x \in \Omega_X,\ y \in \Omega_Y.$$

Since, by Definition (1.2),

$$p(x \mid y) = \frac{p(x, y)}{p_Y(y)}, \quad \text{provided } p_Y(y) \neq 0,$$

it follows that

$$p(x, y) = p_X(x)\,p_Y(y), \quad \text{for all } x \in \Omega_X,\ y \in \Omega_Y.$$

This  is  the  condition  for  independence  that  was  stated  without  proof  in  Unit  1.  It 
can  be  used  to  provide  a  formal  definition  of  independence,  which  is 
straightforward  to  use  in  practice. 


Two discrete random variables X and Y are independent if

$$p(x, y) = p_X(x)\,p_Y(y), \quad \text{for all } x \in \Omega_X,\ y \in \Omega_Y. \qquad (1.3)$$


Example  1.1,  continued 

The  joint  distribution  of  the  number  of  letters  Mary  receives  and  the  number  she 
writes  in  a  week  is  repeated  in  Table  1.4  below.  The  marginal  distributions  are 
also  included. 

Table 1.4

                    y
               0     1     2     p_X(x)
   x      0   0.1   0.4   0.1    0.6
          3   0     0.2   0.2    0.4
   p_Y(y)     0.1   0.6   0.3

Since $p(0, 0) = 0.1$ and $p_X(0)\,p_Y(0) = 0.6 \times 0.1 = 0.06$,

$$p(0, 0) \neq p_X(0)\,p_Y(0),$$

so X and Y are not independent. □


Note that it is only necessary to find one pair of values x, y for which
Condition (1.3) is not satisfied to show that two random variables are not
independent. However, to show that two random variables are independent, you
must show that Condition (1.3) is satisfied for all $x \in \Omega_X$, $y \in \Omega_Y$. This is
illustrated in the next example.

Example  1.3 

The  joint  probability  function  of  two  discrete  random  variables  X  and  Y  is  given 
in  Table  1.5.  The  marginal  distributions  are  also  included. 

Table 1.5

                    y
               0      1     2      p_X(x)
   x      0   0.2    0.4   0.2     0.8
          1   0.05   0.1   0.05    0.2
   p_Y(y)     0.25   0.5   0.25

Notice that

$$p_X(0)\,p_Y(0) = 0.8 \times 0.25 = 0.2 = p(0, 0),$$

$$p_X(1)\,p_Y(0) = 0.2 \times 0.25 = 0.05 = p(1, 0),$$

and so on. In fact

$$p(x, y) = p_X(x)\,p_Y(y)$$

for all $x \in \Omega_X$, $y \in \Omega_Y$, and so X and Y are independent. □
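Checking Condition (1.3) for every pair (x, y) is tedious by hand but easy to automate. The Python sketch below tests the condition for the joint probability functions of Tables 1.4 and 1.5; the function name `is_independent` and the tolerance `tol` (needed because the probabilities are stored as floating-point numbers) are illustrative assumptions.

```python
import math


def is_independent(joint, tol=1e-9):
    """Check Condition (1.3): p(x, y) = p_X(x) p_Y(y) for every pair (x, y)."""
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    # Marginal probability functions, obtained by summing the joint table.
    p_x = {x: sum(joint.get((x, y), 0.0) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), 0.0) for x in xs) for y in ys}
    return all(
        math.isclose(joint.get((x, y), 0.0), p_x[x] * p_y[y], abs_tol=tol)
        for x in xs
        for y in ys
    )


table_1_4 = {
    (0, 0): 0.1, (0, 1): 0.4, (0, 2): 0.1,
    (3, 0): 0.0, (3, 1): 0.2, (3, 2): 0.2,
}
table_1_5 = {
    (0, 0): 0.2, (0, 1): 0.4, (0, 2): 0.2,
    (1, 0): 0.05, (1, 1): 0.1, (1, 2): 0.05,
}

print(is_independent(table_1_4))  # Example 1.1: condition fails at (0, 0)
print(is_independent(table_1_5))  # Example 1.3: condition holds for all pairs
```

The function returns as soon as one pair violates the condition, which mirrors the remark that a single counterexample is enough to establish dependence.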

Question 1.4 For each of the following joint probability functions, decide
whether or not X and Y are independent.

(i)              y
            0     1
   x   0   0.3   0.2
       1   0.3   0.2

(ii)             y
            0     1
   x   0   0.2   0.3
       1   0.4   0.1

(iii)            y
            0      1
   x   0   0.45   0.3
       1   0.15   0.1

Conditional  expectation 

It was noted earlier that the conditional distribution of a random variable Y given
X = x is a univariate distribution. It follows that, for each value x, the
distribution has a mean or expectation. The mean of the conditional distribution
of Y given X = x, which is denoted $E(Y \mid X = x)$, is called the conditional
expectation of Y given X = x. For discrete random variables X and Y,

$$E(Y \mid X = x) = \sum_{y \in \Omega_Y} y\,p(y \mid x).$$

Example  1.1,  continued 

The mean number of letters Mary receives in a week when she writes three letters
is given by the mean of the conditional distribution of Y given X = 3; that is,
$E(Y \mid X = 3)$. The conditional probability function $p(y \mid 3)$, which was found
earlier, is given below.

   y          0     1     2
   p(y|3)     0    0.5   0.5

Therefore

$$E(Y \mid X = 3) = 0 \times 0 + 1 \times 0.5 + 2 \times 0.5 = 1.5.$$ □

In a similar way, the conditional expectation of X given Y = y is defined to be the
mean of the conditional distribution of X given Y = y. If X and Y are discrete,

$$E(X \mid Y = y) = \sum_{x \in \Omega_X} x\,p(x \mid y).$$
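The two definitions of conditional expectation can be sketched in Python as follows, using the joint probability function of Table 1.1. The function names are illustrative assumptions; the worked value checked here is $E(Y \mid X = 3) = 1.5$ from the continuation of Example 1.1.

```python
# Joint probability function of Table 1.1, keyed by (x, y).
joint = {
    (0, 0): 0.1, (0, 1): 0.4, (0, 2): 0.1,
    (3, 0): 0.0, (3, 1): 0.2, (3, 2): 0.2,
}


def expectation_y_given_x(joint, x):
    """E(Y | X = x) = sum over y of y * p(x, y) / p_X(x), for p_X(x) > 0."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    if p_x <= 0:
        raise ValueError("p_X(x) must be positive")
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / p_x


def expectation_x_given_y(joint, y):
    """E(X | Y = y) = sum over x of x * p(x, y) / p_Y(y), for p_Y(y) > 0."""
    p_y = sum(p for (_, yi), p in joint.items() if yi == y)
    if p_y <= 0:
        raise ValueError("p_Y(y) must be positive")
    return sum(x * p for (x, yi), p in joint.items() if yi == y) / p_y


mean_received = expectation_y_given_x(joint, 3)
print(round(mean_received, 10))  # mean number received when three are written
```

Dividing the weighted sum by the marginal probability is equivalent to first forming the conditional probability function and then taking its mean.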



Question  1.5 

(i)  Find  the  mean  number  of  letters  Mary  receives  in  a  week  when  she  does  not 
write  any  letters. 

(ii)  Find  the  mean  number  of  letters  Mary  writes  in  a  week  when  she  receives 
one  letter. 

(Hint:  Use  your  answers  to  Questions  1.2  and  1.3.)  □ 

The  following  question  should  help  you  to  familiarize  yourself  with  some  of  the 
ideas  introduced  in  this  subsection.  You  will  find  additional  exercises  in  the 
Problem  Booklet. 

Question 1.6 The discrete random variables X and Y have the joint
probability distribution specified by the following table.

                 y
            0      1      2
   x   1   0.05   0.2    0.1
       2   0.2    0.15   0.3

(i) Find the marginal distributions of X and Y.

(ii) Calculate the expectations $E(Y \mid X = 1)$ and $E(X \mid Y = 2)$. □

In  this  subsection,  some  ideas  concerning  relationships  between  two  discrete 
random  variables  have  been  introduced.  In  particular,  conditional  distributions 
and  conditional  expectation  have  been  defined.  These  will  be  important  in  this 
unit  and  in  later  units  and  you  should  make  sure  you  are  familiar  with  their 
definitions. 

Similar  ideas  and  results  apply  for  continuous  random  variables.  Some  familiarity 
with  joint  and  conditional  distributions  for  such  variables  is  needed  in  order  to 
follow  the  derivation  of  one  of  the  results  in  Unit  13.  A  summary  of  the  main 
ideas  is  contained  in  the  following  subsection. 


1.2  Continuous  bivariate  distributions 


This  subsection  contains  a  summary  of  basic  ideas  concerning  continuous 
bivariate  distributions.  This  is  provided  for  you  to  refer  to  if  you  are  interested. 
None  of  the  material  in  this  subsection  will  be  assessed. 

Joint  and  marginal  distributions 

The joint distribution of two discrete random variables X and Y is defined by the
joint probability function $p(x, y)$. Correspondingly, the joint distribution of two
continuous random variables is defined by a joint probability density
function.


You will recall from Subsection 4.1 of Unit 1 that a p.d.f. is a function $f(x)$ with
the following properties:

(i) $f(x) \geq 0$, $x \in \mathbb{R}$;

(ii) $P(a < X < b) = \int_a^b f(x)\,dx$;

(iii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

The distribution of a random variable X is completely determined by the p.d.f.

The graph of a p.d.f. $f(x)$ is a curve in two dimensions. Property (i) states that
the graph of the p.d.f. always lies on or above the x-axis; and Property (iii) states
that the total area between the p.d.f. and the x-axis is equal to 1. Property (ii)
describes how the p.d.f. is used to calculate probabilities: probabilities are given
by areas between the p.d.f. and the x-axis. This is illustrated in Figure 1.1.

[Figure 1.1 A p.d.f.: the shaded area gives $P(a < X < b)$]


The joint distribution of two continuous random variables X and Y can be
completely specified by a joint p.d.f. $f(x, y)$. The function $z = f(x, y)$ represents a
surface in three dimensions, the height of the surface above the x-y plane at the
point $(x, y)$ being $z = f(x, y)$. If $f(x, y)$ is the joint p.d.f. of two continuous
random variables then volumes of regions between this surface and the x-y plane
represent probabilities.

A  joint  p.d.f.  has  the  following  three  properties  corresponding  to 
Properties  (i)-(iii)  for  a  p.d.f.  f(x). 

I $f(x, y) \geq 0$ for all x and y.

II The probability that the random variables X and Y take a pair of values
$(x, y)$ within a region, A, of the x-y plane is given by the volume contained
between the surface $z = f(x, y)$ and the area A in the x-y plane. (This is
illustrated for $f(x, y) = x + y$ in Figure 1.2.)

III The total volume contained between the surface $z = f(x, y)$ and the x-y plane
is 1.

[Figure 1.2 The volume shown gives the probability that X and Y take values $(x, y)$ within A]

For two discrete random variables, X and Y, with joint probability function
$p(x, y)$, the marginal distribution of X is found by summing $p(x, y)$ over all
possible values of y; i.e.

$$p_X(x) = \sum_{y \in \Omega_Y} p(x, y), \quad x \in \Omega_X.$$

In general, wherever a sum occurs with discrete random variables, an integral
occurs for continuous distributions, and in this case if X and Y have joint p.d.f.
$f(x, y)$, then $f_X(x)$, the marginal p.d.f. of X, is given by

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy, \quad -\infty < x < \infty.$$


Example  1.4 

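The random variables X and Y have joint p.d.f.

$$f(x, y) = x + y, \quad 0 < x < 1,\ 0 < y < 1.$$

When integrating $f(x, y)$ with respect to y, x is treated as if it were a constant.
So, for $0 < x < 1$,

$$f_X(x) = \int_0^1 (x + y)\,dy = \left[ xy + \tfrac{1}{2} y^2 \right]_0^1 = x + \tfrac{1}{2}.$$ □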


Similarly, the marginal p.d.f. of Y is given by

$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx, \quad -\infty < y < \infty.$$
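The integral defining a marginal p.d.f. can be approximated numerically. The following Python sketch does this for the joint p.d.f. $f(x, y) = x + y$ of Example 1.4, using a simple midpoint rule; the function names and the step count are illustrative assumptions. Integrating $x + y$ over $0 < y < 1$ by hand gives $x + \tfrac{1}{2}$, which the sketch uses as a check.

```python
def f(x, y):
    """Joint p.d.f. of Example 1.4: x + y on the unit square, zero elsewhere."""
    return x + y if 0 < x < 1 and 0 < y < 1 else 0.0


def marginal_x(x, steps=1000):
    """Approximate f_X(x), the integral of f(x, y) over 0 < y < 1 (midpoint rule)."""
    h = 1.0 / steps
    return sum(f(x, (j + 0.5) * h) for j in range(steps)) * h


for x in (0.1, 0.3, 0.9):
    # Compare the numerical value with the closed form x + 1/2.
    print(x, round(marginal_x(x), 6), x + 0.5)
```

The midpoint rule is exact for an integrand that is linear in y, so any discrepancy here is floating-point rounding only; for other joint p.d.f.s the step count would need more care.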


Conditional  distributions,  independence  and  conditional  expectation 


Given two continuous random variables X and Y, the p.d.f. of a conditional
distribution is defined analogously to the probability function of a conditional
distribution involving two discrete random variables. The conditional p.d.f. of
Y given X = x is written $f(y \mid x)$ and is given by

$$f(y \mid x) = \frac{f(x, y)}{f_X(x)}$$

for any x such that $f_X(x) > 0$.

Similarly, the conditional p.d.f. of X given Y = y, written $f(x \mid y)$, is given by

$$f(x \mid y) = \frac{f(x, y)}{f_Y(y)}$$

for any y such that $f_Y(y) > 0$.


Continuing  the  analogy  with  discrete  variates,  a  definition  of  the  independence  of 
two  continuous  random  variables  is  as  follows. 


Two continuous random variables X and Y are said to be independent if
$f(x, y) = f_X(x)\,f_Y(y)$ for all values of x and y.


The conditional expectation of Y given X = x and the conditional expectation of
X given Y = y are defined analogously with conditional expectations for discrete
random variables. Integration replaces summation and p.d.f.s replace probability
functions. For continuous random variables X and Y, the conditional expectation
of Y given X = x is given by

$$E(Y \mid X = x) = \int_{-\infty}^{\infty} y\,f(y \mid x)\,dy.$$

In a similar way, the conditional expectation of X given Y = y is defined to be

$$E(X \mid Y = y) = \int_{-\infty}^{\infty} x\,f(x \mid y)\,dx.$$
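As an illustration only, the Python sketch below approximates $E(Y \mid X = x)$ for the joint p.d.f. $f(x, y) = x + y$ of Example 1.4, again with a midpoint rule. The closed form used for comparison, $(x/2 + 1/3)/(x + 1/2)$, is not taken from the course text; it comes from evaluating the integral of $y(x + y)$ over $0 < y < 1$ by hand and dividing by $f_X(x) = x + \tfrac{1}{2}$. The function names and step count are likewise assumptions.

```python
def f(x, y):
    """Joint p.d.f. of Example 1.4: x + y on the unit square, zero elsewhere."""
    return x + y if 0 < x < 1 and 0 < y < 1 else 0.0


def conditional_mean_y(x, steps=1000):
    """Approximate E(Y | X = x), the integral of y f(x, y) dy divided by f_X(x)."""
    h = 1.0 / steps
    ys = [(j + 0.5) * h for j in range(steps)]  # midpoints of the y-intervals
    f_x = sum(f(x, y) for y in ys) * h          # marginal p.d.f. f_X(x)
    if f_x <= 0:
        raise ValueError("f_X(x) must be positive")
    return sum(y * f(x, y) for y in ys) * h / f_x


x = 0.5
approx = conditional_mean_y(x)
exact = (x / 2 + 1 / 3) / (x + 1 / 2)  # hand-derived closed form (an assumption)
print(round(approx, 6), round(exact, 6))
```

Because the integrand $y(x + y)$ is quadratic in y, the midpoint rule is no longer exact, but with 1000 steps the error is far smaller than the displayed precision.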

As mentioned at the end of Subsection 1.1, some familiarity with joint and
conditional  distributions  for  continuous  random  variables  is  required  in  order  to 
follow  the  derivation  of  a  result  in  Unit  13.  You  may  wish  to  refer  to  this 
subsection  when  you  study  this  part  of  Unit  13.  You  will  not  be  expected  to 
apply  any  of  the  ideas  from  this  subsection  to  solve  problems. 


2  Random  processes 


In  this  section  the  fundamental  ideas  of  random  processes  are  discussed.  These 
ideas  are  introduced  in  Subsection  2.1,  where  a  number  of  examples  are  described. 
In  Subsection  2.2,  the  Bernoulli  process  is  described  and  its  main  properties  are 
explored.  The  notation  for  random  processes  is  discussed  in  Subsection  2.3  and  a 
few  further  examples  are  described. 


2.1  What  is  a  random  process? 

Consider  the  situation  in  the  following  example. 

Example  2.1  Gambler’s  ruin 

Two players, A and B, with initial capital £k and £(a - k) respectively (where a,
k are both positive integers and a > k), engage in a series of games that involve
some element of chance. After each game, the loser pays the winner £1. The
sequence continues until one player has lost all his money; he is ruined.

The situation can be described in terms of a sequence, $\{X_n\}$, of random variables
where £$X_n$ is A's capital after n games. The series ends when either $X_n = 0$ (A is
ruined) or $X_n = a$ (B is ruined). If k = 4, a = 7, then a typical realization of
$\{X_n\}$ for n = 0, 1, ..., 11 is

4, 5, 6, 5, 4, 3, 4, 5, 6, 5, 6, 7.

In this case the series ends when B is ruined after the eleventh game. (This problem is studied in Unit 5.)

If A and B had played again, the results of their games would probably have
produced a different sequence from the one given above; it would have been a
different realization of the sequence of random variables $X_1, X_2, \ldots, X_{11}$. The
distribution of each $X_n$ (other than $X_0$) depends on chance, and it also depends
on $X_{n-1}$. For example, if $X_{n-1} = 4$ then $X_n$ can take only the value 3 or 5. □
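Realizations of the gambler's ruin sequence are easy to generate by simulation. The Python sketch below is one way of doing so under an added assumption: the example says only that the games involve some element of chance, so the probability `p` that A wins any single game (taken as 0.5 by default, with games independent) is hypothetical, as are the function and parameter names.

```python
import random


def simulate_gamblers_ruin(k, a, p=0.5, rng=None):
    """Return one realization X_0, X_1, ... of A's capital until A or B is ruined.

    k: A's initial capital; a: total capital of both players.
    p: assumed probability that A wins a single game (not given in the text).
    """
    rng = rng or random.Random()
    x = k
    path = [x]
    while 0 < x < a:                        # stop when X_n = 0 or X_n = a
        x += 1 if rng.random() < p else -1  # loser pays the winner 1 pound
        path.append(x)
    return path


path = simulate_gamblers_ruin(k=4, a=7, rng=random.Random(1))
print(path)
print("A is ruined" if path[-1] == 0 else "B is ruined")
```

Running the function with different seeds gives different realizations of the same random process, in the sense described in the example; each step changes A's capital by exactly 1.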




In  our  study  of  the  course  so  far  we  have  dealt  with  situations  where  some 
element  of  chance  is  involved.  In  each  case  some  feature  of  the  experiment  which 
could  be  counted  or  measured  was  abstracted.  Then,  an  associated  random 
variable  was  defined  and  its  probability  distribution  was  specified.  Consequently, 
probability  statements  could  be  made  about  various  events  associated  with  the 
random  variable,  and  its  mean  and  variance  could  be  calculated. 

Fundamentally,  we  were  concerned  with  the  values  that  a  random  variable  might 
take  either  at  a  single  observation  or  at  a  set  of  repeated  observations.  In  this 
section,  the  concept  of  a  random  process  or  stochastic  process  is  introduced. 
The  basic  difference  between  a  random  process  and  a  random  variable  is  that,  in  a 
random  process,  the  situation  being  studied  is  a  developing  one,  usually  over  time. 
In  Example  2.1  the  process  was  observed  only  at  specific  instants  of  time, 
immediately  after  a  game  had  been  completed.  The  next  example  describes  a 
process  that  is  observed  continuously  during  an  interval  of  time. 

Example  2.2 

For  two  hours,  a  record  is  kept  of  the  number  of  customers  using  a  village  shop. 
Figure 2.1 shows the number of customers in the shop at any time t, in hours
($0 \leq t \leq 2$).


Figure  2.1  The  number  of  customers  in  a  shop 

Each  rise  in  the  graph  shows  a  customer  arriving;  each  fall,  a  customer  leaving. 

At  the  start  of  the  period  of  observation  there  were  three  customers  in  the  shop, 
and  during  the  second  hour  there  was  a  period  of  about  40  minutes  when  the  shop 
was  empty. 

If  such  a  record  had  also  been  kept  at  the  same  time  on  another  day,  it  would 
almost  certainly  have  been  different  from  the  one  shown  in  Figure  2.1;  it  would 
have been a different realization of the stream of random variables
$\{X(t);\ 0 \leq t \leq 2\}$, where $X(t)$ is the number of customers at time t. On each
occasion  that  such  records  are  kept,  there  is  a  realization  of  a  developing 
situation:  the  number  of  customers  at  time  t. 

The  distribution  of  X(t)  for  any  fixed  t  will  obviously  depend  on  many  factors, 
such  as  the  arrival  rate  of  customers  and  the  time  a  customer  spends  in  the  shop. 
It also depends on the starting value. For example, in the realization shown in
Figure 2.1, $X(0) = 3$, so it would be very unlikely that $X(0.01) = 0$; that is, it is
very unlikely that all three customers would leave in under a minute. On the other
hand, if the shop had been empty at time 0, then it would be quite likely that
$X(0.01) = 0$. □

In the gambler’s ruin example, the process is observed only at specific points in
time, and this is said to be a discrete-time random process; whereas the
number of customers in the shop is a random process in continuous time.
However,  in  both  the  examples  discussed  so  far,  the  random  variable  has  taken 
only  discrete  (non-negative  integer)  values.  In  most  of  the  processes  considered  in 


The word ‘stochastic’ is derived
from a Greek word meaning ‘to aim
at’. The evolution of its meaning is
uncertain, but it now means
‘pertaining to chance’ or ‘random’.
It is pronounced sto-kas-tic.


This  example  is  a  particular  case  of 
a queueing process; such processes
are  studied  in  Unit  9. 




this  course,  this  will  be  the  case.  However,  there  are  very  many  situations  where 
the  random  variable  is  continuous,  as  is  illustrated  in  the  next  example. 

Example  2.3 

The  level  of  water  in  a  reservoir  depends:  on  supply,  in  the  form  of  rain  or  melting 
snow;  on  demand,  which  is  the  water  used  by  the  community  served  by  the 
reservoir;  and  on  other  minor  factors,  such  as  evaporation  from  the  surface.  Both 
supply  and  demand  can  be  assumed  to  be  continuous  random  variables.  A  typical 
realization  of  the  level  L(t)  of  water  in  the  reservoir  plotted  against  time  (in 
years)  might  look  like  Figure  2.2. 


Figure  2.2  A  realization  of  {L(t);t  >  0},  the  water  level 
in  a  reservoir 

Usually,  one  would  expect  the  level  to  show  a  seasonal  pattern  over  a  year.  It  will 
tend  to  reach  a  maximum  around  March  after  the  winter  rain,  and  to  fall  to  a 
minimum  in  the  autumn,  because  there  is  more  demand  (watering  gardens,  etc.) 
in  the  summer,  but  less  supply.  However,  there  will  be  variation  from  year  to 
year.  It  is  possible  that  L(t)  may  fall  to  zero  in  a  drought  or  remain  at  a 
maximum  for  a  period  of  time  when  the  reservoir  is  full  and  an  overflow  is  in 
operation.  In  this  example  the  variate  is  continuous  and  develops  over  continuous 
time.  The  distribution  of  L(t)  will  certainly  depend  on  time  (season  of  the  year), 
and it will also depend on the value of $L(t - \delta t)$, where $\delta t$ is small. Since $L(t)$ is
continuous, the difference between $L(t - \delta t)$ and $L(t)$ will also be small. □

The notation $\delta t$ (read ‘delta t’) is commonly used in stochastic processes to
denote the length of a ‘short’ interval of time. This notation is used in some of the
more mathematical parts of the course (in Subsection 3.2 of this unit, for
instance). There you will appreciate the significance of the time interval $[t, t + \delta t]$
(or $[t - \delta t, t]$) and the assumption that the interval is ‘brief’. Essentially, it is
useful  to  presuppose  a  time  interval  of  such  a  short  duration  that,  at  most,  a 
single  event  or  change  of  state  occurs  within  it,  and  during  which  it  is  highly 
likely  that,  in  fact,  nothing  happens. 

You  have  seen  so  far  that  a  random  process  developing  over  time  may  change 
either  at  any  time  or  only  at  discrete  points.  You  have  also  seen  that  the  random 
variable  itself  may  take  only  discrete  integer  values  or  it  may  be  continuous.  Most 
of this course will be devoted to integer-valued variates, but we shall study
continuous variates in Unit 14. We remarked that the process is usually
developing over time, but it is also possible to have a random process that
changes  over  space. 

Example  2.4 

Figure  2.3  shows  the  variation  in  weight  per  unit  length  (essentially,  the  variation 
in  cross-sectional  area)  along  a  strand  of  wool  yarn — the  strand  will  not  be  of 
uniform  thickness.  This  is  an  example  of  a  random  process  developing  over 
distance  instead  of  time. 


Figure  2.3  Variation  of  weight  per  unit  length  along  a 
strand  of  wool  yarn 

The  length  s  along  the  yarn  corresponds  to  the  time  variable  t  that  has  appeared 
in previous examples, and is continuous. The random variable $W(s)$, weight per
unit length, is also continuous: the value of $W$ does not change by a large amount
in a short length. If the weight at distance $s$ is known, say $W(s) = w$, then
$W(s + \delta s)$, where $\delta s$ is small, will be close to $w$. From the realization in
Figure  2.3,  the  thickness  of  the  strand  appears  to  remain  approximately  constant 
over  the  length  of  yarn  observed — there  is  no  particular  trend.  □ 

In  Unit  3  we  shall  study  random  processes  in  space.  In  particular,  we  shall  meet 
examples  of  random  processes  in  two-dimensional  space,  where  a  process  develops 
over  an  area  of  land;  such  processes  arise  frequently  in  biology  and  geology. 

Here  is  another  example  of  a  random  process. 

Example  2.5 

Ten  successive  independent  observations  are  taken  on  a  random  variable 
$Y \sim N(\mu, \sigma^2)$. The observations form a single realization of the sequence
$\{Y_n; n = 1, 2, \ldots, 10\}$, where $Y_n$ is the $n$th observation. Consequently, $\{Y_n\}$ can be
thought of as a random process. This is not a very interesting process, as
$Y_n \sim N(\mu, \sigma^2)$ for each $n = 1, 2, \ldots, 10$, and so the process does not actually
develop over time. The distributions of all the $Y_n$ are the same; the distribution of
any $Y_n$ does not depend on $n$ or on the realizations of the previous variates
$Y_1, Y_2, \ldots, Y_{n-1}$.

However,  there  are  other  more  interesting  random  variables  that  we  could  obtain 
from  this  process.  For  example,  there  are  the  ‘total  so  far’  and  the  ‘average  so  far’: 

$$S_n = Y_1 + Y_2 + \cdots + Y_n, \quad n = 1, 2, \ldots, 10;$$

$$A_n = \frac{1}{n}(Y_1 + Y_2 + \cdots + Y_n) = \frac{S_n}{n}, \quad n = 1, 2, \ldots, 10.$$

Both $\{S_n; n = 1, 2, \ldots, 10\}$ and $\{A_n; n = 1, 2, \ldots, 10\}$ are sequences of random
variables, and their distributions can be obtained directly from known results for
the distribution of the sum of independent normal variates:

$$S_n \sim N(n\mu, n\sigma^2) \quad \text{and} \quad A_n \sim N(\mu, \sigma^2/n).$$

So both these distributions depend on $n$. Furthermore, the $S_n$, $n = 1, \ldots, 10$, are
not independent, as is now shown. If we had observed $Y_1 = y_1$, $Y_2 = y_2$, $\ldots$,
$Y_{n-1} = y_{n-1}$, then the distribution of $S_n$ given $y_1, y_2, \ldots, y_{n-1}$ is not the same as
the unconditional distribution of $S_n$. Since $y_1 + \cdots + y_{n-1} = s_{n-1}$, we can write
$S_n = s_{n-1} + Y_n$; the conditional distribution is also normal, with mean

$$E[S_n \mid (y_1, y_2, \ldots, y_{n-1})] = E(y_1 + y_2 + \cdots + y_{n-1} + Y_n) = E(s_{n-1} + Y_n) = s_{n-1} + \mu,$$

and variance

$$V[S_n \mid (y_1, y_2, \ldots, y_{n-1})] = V(y_1 + y_2 + \cdots + y_{n-1} + Y_n) = V(s_{n-1} + Y_n) = \sigma^2.$$

That is,

$$S_n \mid (y_1, y_2, \ldots, y_{n-1}) \sim N(s_{n-1} + \mu, \sigma^2).$$


See  Chapters  4  and  5  of  M246. 


Here  we  have  used  the  result  that, 
if  c  is  a  constant,  then 
$E(c + Y) = c + E(Y)$.

See  Chapter  4  of  M246. 

Here  we  have  used  the  result  that, 
if  c  is  a  constant,  then 
$V(c + Y) = V(Y)$.

See  Chapter  4  of  M246. 




Thus the distribution of $S_n$ depends on the past history of the process; that is,
the $S_n$ are not independent. This remark applies also to $A_n$, since

$$A_n \mid (y_1, y_2, \ldots, y_{n-1}) \sim N\left(\frac{s_{n-1} + \mu}{n}, \frac{\sigma^2}{n^2}\right).$$

Figure 2.4 shows a typical realization of the random process $\{Y_n; n = 1, 2, \ldots, 10\}$
for the case $Y_n \sim N(6, 9)$, and the corresponding realization of
$\{A_n; n = 1, 2, \ldots, 10\}$.


Here we have used the results
$E(cX) = cE(X)$ and
$V(cX) = c^2 V(X)$
for any random variable $X$ and
constant $c$. See Chapter 4 of M246.


When a continuous random
variable $X_n$ is observed at discrete
times, it is conventional to join the
points $(n, x_n)$ by straight lines.


Figure 2.4 Realizations of $\{Y_n\}$ and $\{A_n\}$ for $n = 1, 2, \ldots, 10$, where $Y_n \sim N(6, 9)$

Figure  2.4  illustrates  features  that  have  already  been  noted  mathematically.  The 
graph  of  {Yn}  shows  no  particular  pattern:  these  observations  are  independent. 
The graph of $\{A_n\}$ shows a rapid settling-down effect consistent with its variance,
$\sigma^2/n$, which decreases as $n$ increases. It is also apparent that the value of any
particular $A_n$ depends on the value of the previous variate, $A_{n-1}$. □
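
The behaviour described in Example 2.5 can be reproduced with a short simulation. The following is only a sketch: the function name `simulate_running_average`, its parameters and the seed are illustrative choices, not part of the course material. It draws independent observations from N(6, 9) (standard deviation 3) and builds the ‘total so far’ and ‘average so far’ from them.

```python
import random

def simulate_running_average(mu=6.0, sigma=3.0, n_obs=10, seed=None):
    """Return (ys, totals, averages): one realization of {Y_n}, {S_n} and {A_n}."""
    rng = random.Random(seed)
    ys, totals, averages = [], [], []
    running_total = 0.0
    for n in range(1, n_obs + 1):
        y = rng.gauss(mu, sigma)             # Y_n ~ N(mu, sigma^2), independent draws
        running_total += y                   # S_n = S_{n-1} + Y_n
        ys.append(y)
        totals.append(running_total)
        averages.append(running_total / n)   # A_n = S_n / n
    return ys, totals, averages

# Illustrative run; a different seed gives a different realization.
ys, totals, averages = simulate_running_average(seed=1)
for n, (y, a) in enumerate(zip(ys, averages), start=1):
    print(f"n={n:2d}  Y_n={y:6.2f}  A_n={a:6.2f}")
```

Running this with several seeds typically shows the Y values scattering without pattern while the A values settle down, in line with the variance of the average decreasing as n increases.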

Example  2.5  has  demonstrated  another  feature  of  random  processes:  there  may  be 
several  sequences  of  random  variables  associated  with  a  developing  process. 

During  this  course,  many  random  processes  will  be  developed,  and  associated 
with  each  process  there  will  be  several  different  sequences  of  random  variables 
illustrating  different  features  of  each  process. 

In this subsection the concept of a random process as a developing sequence of
random variables has been introduced. A process usually develops over time, though it
can develop over space, and time can be either continuous or discrete (integer-valued). The
random variable can itself take either continuous or discrete (usually integer)
values.  There  may  be  several  sequences  of  random  variables  associated  with  a 
random  process;  in  general,  the  distribution  of  any  member  of  the  sequence  may 
depend  on  the  value  of  time,  and  it  may  also  depend  on  the  history  of  the  process 
up  to  that  time. 


2.2  The  Bernoulli  process 

In  this  subsection  a  familiar  random  process  is  discussed  in  more  detail. 

Consider  the  random  process  consisting  of  a  sequence  of  Bernoulli  trials.  For 
example,  this  could  be  a  sequence  of  throws  of  a  die  where  success  is  a  ‘six’;  it 
could  involve  testing  items  off  a  production  line  where  success  is  a  ‘good’  item  and 
failure  is  a  defective  item;  it  might  involve  treating  successive  patients  arriving  at 




a  hospital  where  success  is  a  ‘cure’.  We  assume  that  the  trials  are  independent 
and  that  the  probability  of  success  remains  constant  from  trial  to  trial.  This 
process  is  known  as  a  Bernoulli  process. 


A  Bernoulli  process  is  the  name  given  to  a  sequence  of  Bernoulli  trials  in 
which: 

(i)  trials  are  independent; 

(ii)  the  probability  of  success  remains  the  same  from  trial  to  trial. 


For  a  Bernoulli  process,  the  idea  of  trials  occurring  in  order,  one  after  the  other,  is 
crucial. 

There  are  several  different  sequences  of  random  variables  associated  with  a 
Bernoulli process. The simplest one is $\{Y_n; n = 1, 2, \ldots\}$, where $Y_n = 1$ if the $n$th
trial  results  in  success  and  Yn  =  0  if  it  is  a  failure.  So  each  Yn  has  a  Bernoulli 
distribution  with  parameter  equal  to  p,  the  probability  of  success  at  any  trial.  In 
this  case,  the  {Yn}  are  identically  and  independently  distributed;  the  distribution 
of  Yn  does  not  depend  on  n  or  on  the  results  of  previous  trials. 

The  ‘time’  variable,  n,  is  discrete;  it  denotes  the  number  of  the  trial.  (The  real 
time  variable,  the  time  between  trials,  is  not  considered  in  a  Bernoulli  process;  it 
is  irrelevant,  for  example,  exactly  when  patients  arrive  at  the  hospital.)  The 
random  variable  itself  is  also  discrete-valued. 

A  typical  realization  of  the  process  is 

0 0 0 1 0 1 1 0 1 1 1 0 0.

In  this  realization,  successes  occur  at  trials  4,  6,  7,  9,  10  and  11,  and  failures  at 
the  other  trials.  In  this  realization  there  were  13  trials. 

Another  sequence  of  random  variables  associated  with  a  Bernoulli  process  is 
$\{X_n; n = 1, 2, \ldots\}$, where

$$X_n = Y_1 + Y_2 + \cdots + Y_n.$$

This  sequence  specifies  the  number  of  successes  to  have  occurred  after  n  trials  are 
completed.  The  realization  of  Xn  corresponding  to  that  of  Yn  given  above  is 

0 0 0 1 1 2 3 3 4 5 6 6 6,

and  this  realization  is  shown  graphically  in  Figure  2.5. 


Figure 2.5 A realization of $\{X_n; n = 1, 2, \ldots, 13\}$ from a Bernoulli process where $X_n = \sum_{i=1}^{n} Y_i$ and $Y_i \sim B(1, p)$

Note that, as $X_n$ is discrete, the graph of any realization appears as an increasing
step function.
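
As a check on the two realizations quoted above, the running count of successes can be computed directly from the 0/1 sequence. This is a minimal sketch; the variable names are illustrative.

```python
from itertools import accumulate

# The realization of {Y_n} quoted in the text: 1 = success, 0 = failure.
y = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0]

# X_n = Y_1 + ... + Y_n is the running total of successes.
x = list(accumulate(y))

# Trial numbers (counting from 1) at which successes occurred.
success_trials = [n for n, outcome in enumerate(y, start=1) if outcome == 1]

print(x)
print(success_trials)
```

The list `x` never decreases and rises by exactly one at each success, which is the step-function behaviour noted above.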




Question  2.1 

(i) What is the unconditional distribution of $X_n$?

(ii) Specify the conditional distribution of $X_n$ given $X_{n-1} = x$. □

Example  2.6 

In  this  example,  observers  classified  each  day  in  a  three-week  interval  as  either 
‘Wet’  or  ‘Dry’  according  to  some  rule  for  summarizing  the  weather  in  any 
particular  day.  The  21-day  sequence  went  as  follows: 

Wet  Wet  Wet  Dry  Dry  Wet  Wet  Dry  Wet  Wet  Wet  Wet  Dry  Dry  Dry  Wet 
Wet  Wet  Dry  Dry  Dry 

Scoring  1  for  a  wet  day  and  0  for  a  dry  day,  the  sequence  may  more  conveniently 
be  written  as 

1 1 1 0 0 1 1 0 1 1 1 1 0 0 0 1 1 1 0 0 0.

This is a realization of the random process $\{X_n; n = 1, 2, \ldots, 21\}$ where the
random variable $X_n$ is discrete, taking the value 0 or 1; and $n$ denotes the number
of the day in the sequence. In other words, $X_n$ is a Bernoulli random variable.

The  changing  weather  from  day  to  day  may  be  well  modelled  by  a  Bernoulli 
process  if  (and  only  if!)  it  is  reasonable  to  assume  that  the  weather  on  any  day  is 
independent  of  the  weather  on  preceding  days  and  if  p,  the  probability  of  rain, 
does  not  vary  from  day  to  day. 

Taken  over  a  period  of  observation  as  long  as  a  year,  these  assumptions  would 
appear  to  fail:  there  will  be  ‘wet  spells’  and  ‘dry  spells’;  and,  in  general,  rain  is 
more  likely  in  the  winter  months  than  in  the  summer  months.  But  over  a  shorter 
period,  where  perhaps  seasonal  variation  may  be  discounted,  the  idea  of  a 
Bernoulli  process  might  provide  a  useful  model.  (Actually,  if  a  good  model  is 
required  for  representing  the  seasonal  variation  in  weather,  then  it  may  be 
necessary  to  allow  p  to  vary  with  n  in  some  really  quite  complicated  way.)  In  any 
case, the random process $\{X_n; n = 1, 2, \ldots, 21\}$ is a discrete-valued discrete-time
random  process.  □ 

This  example  raises  a  crucial  point:  probability  models  of  the  sort  that  will  be 
considered throughout the whole of this course are not usually ‘right’ or ‘wrong’:
they  may  be  better  described  as  adequate  or  inadequate.  A  probability  model  is 
adequate  if  it  usefully  represents  those  aspects  of  randomness  sufficient  for  the 
purposes  to  which  the  model  is  to  be  put.  If  you  have  studied  M246  then  you  will 
be  familiar  with  the  idea  of  testing  the  adequacy  of  a  model  using  data,  and  with 
goodness-of-fit  tests.  In  Unit  3  of  this  course,  more  probability  models  are 
developed  and  tested  against  data.  However,  for  the  remainder  of  the  course  we 
shall  restrict  our  attention  to  the  development  and  application  of  models,  and 
shall  not  be  concerned  with  testing  their  adequacy. 

Example  2.7 

A  gambler  who  buys  a  single  ticket  each  week  for  the  British  national  lottery  has 
a  probability  of  p  =  1/13983816  of  winning  the  jackpot  (or  a  share  of  it).  This 
probability  is  unaltered  from  week  to  week,  and  what  happens  in  any  week  is 
independent  of  what  has  occurred  in  previous  weeks.  We  can  define  the  Bernoulli 
random  variable  Xi  to  take  the  value  1  if  the  gambler  wins  the  jackpot  at  the  ith 
attempt,  and  0  otherwise.  Then  over  a  period  of  (say)  20  weeks  the  random 
process $\{X_i; i = 1, 2, \ldots, 20\}$ is a discrete-valued discrete-time random process
that  may  be  modelled  by  a  Bernoulli  process  with  parameter  p.  (In  this  case  the 
model  is  exact  because  the  two  defining  assumptions  of  the  Bernoulli  process  hold 
exactly.)  □ 


Example  2.8 

At  first  sight  a  Bernoulli  process  might  appear  to  be  a  useful  model  for  the 
sequence  of  boys  and  girls  in  a  family:  scoring  1  for  a  girl,  say,  and  0  for  a  boy. 
Then  a  family  of  five  with  four  sisters  and  a  baby  brother  might  be  represented  by 
the  sequence  11110.  The  probability  p  of  a  female  child  may  be  estimated  from 
data.  For  some  purposes  the  model  may  be  adequate:  indeed,  departures  from  the 
two  defining  assumptions  of  the  Bernoulli  process  are  typically  rather  hard  to 
detect.  But  extended  analyses  of  available  data  suggest  that  the  assumption  of 
independence  from  birth  to  birth  does  indeed  break  down.  Nature  has  a  kind  of 
memory, and the sex of a previous child affects to some degree the probability
distribution  for  the  sex  of  a  subsequent  child.  □ 

Probability  models  are  developed  for  a  purpose:  to  shed  light  on  a  random 
process  and  to  answer  questions  about  it.  Depending  on  the  question  posed, 
different  random  variables  may  be  useful  in  providing  an  answer.  Several  random 
variables  associated  with  a  Bernoulli  process  are  discussed  below.  Most  of  these 
were  also  discussed  in  Subsection  2.2  of  Unit  1. 

(1)  The  outcome  of  the  ith  trial 

By definition, for a Bernoulli process the outcome of the $i$th trial may be ‘success’
or ‘failure’, with respective probabilities $p$ and $q = 1 - p$. If ‘success’ is denoted by the
event $[X_i = 1]$ and ‘failure’ by the event $[X_i = 0]$, then $X_i$ has a Bernoulli
distribution with parameter $p$.

(2) The number of successes in a set or sequence of n trials

The number of successes $Y$ in $n$ different trials has a binomial distribution with
parameters $n$ and $p$: $Y \sim B(n, p)$.

(3)  The  number  of  trials  between  consecutive  successes 

In  different  contexts,  what  matters  is  the  number  of  failures  between  consecutive 
successes,  or  the  number  of  trials  necessary  to  achieve  first  success,  or  the  number 
of  successes  before  first  failure,  and  so  on.  All  these  random  variables  follow  some 
form  of  geometric  distribution.  The  questions  are  subtly  different,  and  it  is  easy 
to  get  rather  embroiled  in  detail:  perhaps  it  is  tempting  to  try  to  memorize 
probabilities  for  each  situation.  You  should  resist  this  temptation  if  you  can: 
rather,  address  each  problem  with  a  clean  sheet  of  paper  as  it  is  presented. 
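
To see how two of these variants differ, here is a short derivation under the Bernoulli-process assumptions (independent trials, constant success probability $p$, with $q = 1 - p$). The symbols $T$ and $N$ are introduced here only for illustration. If $T$ is the number of trials up to and including the first success, then the first $t - 1$ trials must be failures and the $t$th a success; if $N$ is the number of failures before the first success, then $N = T - 1$.

$$P(T = t) = q^{t-1} p, \quad t = 1, 2, \ldots; \qquad P(N = k) = q^{k} p, \quad k = 0, 1, 2, \ldots.$$

Both are geometric in form and differ only in where the range starts, which is why it pays to work from the definition each time.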

Simulation 

Just  as  it  is  useful  to  simulate  observations  on  random  variables  (as  described  in 
Unit 1), it is often helpful to simulate realizations of random processes: this gives
the  observer  an  idea  of  what  sort  of  behaviour  is  typical  and  what  is  unusual. 
Sometimes,  for  complex  processes,  simulation  may  be  the  only  method  for 
obtaining  some  feeling  for  likely  behaviour.  For  a  Bernoulli  process,  simulation  is 
very  easy. 

In  Section  6  of  Unit  1  procedures  are  described  for  simulating  observations  from 
discrete  distributions.  For  simulating  the  outcome  at  successive  trials  in  a 
Bernoulli  process,  it  is  necessary  to  simulate  values  from  a  Bernoulli  distribution. 

Example  2.9 

Assuming  that  over  moderately  short  intervals  (say,  up  to  about  twenty  days) 
sequences  of  wet  and  dry  days  at  a  particular  location  can  be  modelled  as  a 
Bernoulli  process  with  P(Wet)  =  0.3,  then  sequences  of  random  digits  can  be  used 
as  follows. 


Digit                   Outcome
0, 1, 2                 Wet
3, 4, 5, 6, 7, 8, 9     Dry



Taking the first twenty digits from the 18th row of random digits on page 42 of
Neave gives the following set of simulated values.

Digit     9    2    9    5    6    0    9    4    0    1
Outcome   Dry  Wet  Dry  Dry  Dry  Wet  Dry  Dry  Wet  Wet

Digit     5    8    8    9    2    5    9    6    8    6
Outcome   Dry  Dry  Dry  Dry  Wet  Dry  Dry  Dry  Dry  Dry
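
The digit-to-outcome rule of Example 2.9 is easy to mechanize. The sketch below applies it to the twenty digits quoted in the example; the function name `simulate_weather` is an illustrative choice, not part of the course material.

```python
def simulate_weather(digits, wet_digits=(0, 1, 2)):
    """Map each random digit to 'Wet' or 'Dry'; 3 of the 10 digits give P(Wet) = 0.3."""
    return ["Wet" if d in wet_digits else "Dry" for d in digits]

# The twenty digits quoted in Example 2.9.
digits = [9, 2, 9, 5, 6, 0, 9, 4, 0, 1, 5, 8, 8, 9, 2, 5, 9, 6, 8, 6]
days = simulate_weather(digits)
print(days)
print(days.count("Wet"), "wet days out of", len(days))
```

Replacing the quoted digits by digits from a pseudo-random generator would give further realizations of the same process.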

Question  2.2  Use  the  27th  row  of  random  digits  on  page  42  of  Neave  to 
simulate  the  sequence  of  boys  and  girls  in  a  family  of  six  children.  Take  the 
probability  of  a  girl  to  be  0.47. 

(Hint:  Take  the  digits  in  pairs:  25,  27,  83,  09  and  so  on.)  □ 


There  are  other  sequences  of  random  variables  associated  with  the  Bernoulli 
process.  The  one  that  is  most  likely  to  be  of  interest  is  the  number  of  trials 
required  for  each  success.  For  example,  in  a  manufacturing  process  where  success 
is  the  production  of  a  good  item  and  failure  a  defective  item,  the  manufacturer 
would  be  interested  in  how  many  items  he  had  to  produce  in  order  to  get  each 
good item. Let $T_k$ be the number of trials after the $(k - 1)$th success up to and
including the $k$th success. Then $\{T_k; k = 1, 2, \ldots\}$ is another sequence of random
variables. Now the ‘time’ variable, $k$, is the number of successes, and the range of
each $T_k$ is $\{1, 2, \ldots\}$.


Realizations of $\{T_k\}$ can be calculated from a simulated sequence of random
variables $\{Y_n\}$, where $Y_n = 1$ if the $n$th trial results in success and $Y_n = 0$ if it is a
failure. For the realization of $\{Y_n\}$ given by

0 0 0 1 0 1 1 0 1 1 1 0 0,


This sequence $\{Y_n\}$ was introduced
near  the  start  of  this  subsection. 


we  have 


$$T_1 = 4, \quad T_2 = 2, \quad T_3 = 1, \quad T_4 = 2, \quad T_5 = 1, \quad T_6 = 1 \quad \text{and} \quad T_7 > 2.$$
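
The calculation of the waiting times from a 0/1 sequence can be sketched as follows. The function name `waiting_times` is illustrative; it returns the completed waiting times together with the number of trials observed since the last success.

```python
def waiting_times(outcomes):
    """Return (completed T_k values, number of trials since the last success)."""
    times, trials_since_success = [], 0
    for outcome in outcomes:
        trials_since_success += 1
        if outcome == 1:
            times.append(trials_since_success)   # a success completes the current T_k
            trials_since_success = 0
    return times, trials_since_success

# The realization of {Y_n} quoted in the text.
y = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0]
completed, pending = waiting_times(y)
print(completed)   # the completed waiting times
print(pending)     # trials so far without reaching the next success
```

The second value explains the last entry in the list above: two trials have passed since the sixth success without a seventh, so all that is known is that the seventh waiting time exceeds 2.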


Question  2.3 

(i)  What  is  the  distribution  of  Tk? 

(ii) Does the distribution of $T_k$ depend on $k$ or on the values of $\{T_j; j < k\}$? □

2.3  Further  examples  of  random  processes 

In  this  section  the  concept  of  a  random  or  stochastic  process  has  been  introduced. 
Mathematically, a random process is a collection $\{X(t)\}$ or $\{X_n\}$ of random
variables defined in either continuous time or discrete time. Usually $t$ is defined
for an interval of the real line, most commonly $-\infty < t < \infty$ or $t > 0$, so we write
the collection as $\{X(t); t \in \mathbb{R}\}$ or $\{X(t); t > 0\}$. The range of $t$ is called the time
domain. If a process is described in discrete time, it is usually at integer points,
so we might write $\{X_n; n = 0, 1, 2, \ldots\}$. In this case, the time domain is
$\{0, 1, 2, \ldots\}$.

The  set  of  values  that  may  be  taken  by  the  random  variables  in  a  random  process 
is  called  the  state  space  of  the  process.  A  state  space  can  be  either  discrete  or 
continuous. In Example 2.2, $X(t)$ is the number of customers in the shop at time
$t$, so the state space of $\{X(t); t > 0\}$ is $\{0, 1, 2, \ldots\}$; the state space is discrete in
this case. And, if there are $x$ customers in the shop, the process $\{X(t); t > 0\}$ is
said to be in state $x$. In Example 2.3, $L(t)$ is the level of water in a reservoir and
so is a continuous non-negative variate; the state space of $\{L(t); t > 0\}$ is
$\{l : l \ge 0\}$.

Although  a  random  process  (or  stochastic  process)  is  actually  defined 
mathematically  as  a  sequence  of  random  variables,  the  term  is  also  used  to 
describe  a  whole  physical  process. 




The  remainder  of  this  section  consists  of  further  examples  of  random  processes, 
many  of  which  will  be  discussed  in  detail  later  in  the  course.  At  this  stage,  you 
are  expected  to  note  the  type  of  situation  that  may  be  modelled  by  a  random 
process,  to  become  accustomed  to  identifying  sequences  of  random  variables,  to 
recognize  whether  the  time  domain  and  the  state  space  are  discrete  or  continuous, 
and  to  start  to  use  the  notation  for  random  processes. 

Example  2.10  Machine  breakdowns 

A  machine  can  be  in  one  of  two  states:  working  or  under  repair.  As  soon  as  it 
breaks  down,  the  repairman  starts  repairing  it;  and  as  soon  as  he  finishes,  the 
machine  starts  working  again.  The  machine  occupies  these  two  states  for  times 
which  follow  exponential  distributions  with  different  means. 

What  random  processes  are  associated  with  this  model?  Suppose  that,  at  time  t, 
t > 0, X(t) is a random variable such that X(t) = 1 if the machine is working and
X(t)  =  0  if  it  is  under  repair.  Then  the  sequence  {X(t);t  >  0}  is  a  random 
process  defined  in  continuous  time  and  having  the  discrete  state  space  {0, 1}  which 
describes  the  state  of  the  machine  at  time  t.  We  could  consider  the  time  between 
successive  breakdowns.  The  model  could  be  extended  to  include  several  machines, 
and  questions  could  then  be  asked  about  how  many  machines  are  working  or  how 
many  repairmen  are  required  to  prevent  a  build-up  of  broken-down  machines.  The 
model  could  include  a  third  state:  broken  and  awaiting  repair.  □ 
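
A realization of the two-state machine process can be simulated by alternating exponentially distributed working and repair times. This is only a sketch: the function name `simulate_machine`, the mean times and the choice to start in the working state are assumptions made for illustration; the example itself gives no numerical values.

```python
import random

def simulate_machine(mean_up, mean_repair, horizon, seed=None):
    """Return (start_time, state) pairs; state 1 = working, 0 = under repair."""
    rng = random.Random(seed)
    t, state = 0.0, 1                        # assumption: the machine starts in the working state
    path = []
    while t < horizon:
        path.append((t, state))
        mean = mean_up if state == 1 else mean_repair
        t += rng.expovariate(1.0 / mean)     # exponential time in the current state, with the given mean
        state = 1 - state                    # switch between working and under repair
    return path

# Hypothetical mean times: 10 time units working, 2 under repair.
path = simulate_machine(mean_up=10.0, mean_repair=2.0, horizon=50.0, seed=3)
for start, state in path:
    print(f"t = {start:6.2f}: X(t) becomes {state}")
```

The times between successive entries with state 0 are realizations of the time between successive breakdowns mentioned above.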

Example  2.11  The  card-collecting  problem 

With  every  petrol  purchase,  an  oil  company  gives  away  a  card  portraying  an 
important  event  in  the  history  of  the  petroleum  industry.  There  are  20  such  cards 
and  on  each  occasion  the  probability  of  receiving  any  particular  card  is  1/20.  For 
a  particular  customer,  one  sequence  of  random  variables  associated  with  this 
situation is $\{X_n; n = 1, 2, \ldots\}$, where $X_n = 1$ if the card received at his $n$th
purchase  is  a  new  one  for  his  collection  and  Xn  =  0  if  his  nth  card  is  a  replica  of 
one  he  has  already.  □ 
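
A realization of this sequence can be generated by drawing cards uniformly at random and recording whether each one is new. The sketch below assumes nothing beyond the equal-probability rule stated in the example; the function name `simulate_collection`, the number of purchases and the seed are illustrative.

```python
import random

def simulate_collection(n_cards=20, n_purchases=30, seed=None):
    """Return the 0/1 sequence X_1, ..., X_n: 1 if the nth card is new, 0 if a duplicate."""
    rng = random.Random(seed)
    owned = set()
    xs = []
    for _ in range(n_purchases):
        card = rng.randrange(n_cards)        # each card is equally likely at every purchase
        xs.append(0 if card in owned else 1)
        owned.add(card)
    return xs

xs = simulate_collection(seed=7)
print(xs)
print("distinct cards collected:", sum(xs))
```

In a typical run the ones are frequent at the start and become rarer as the collection grows.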

Question  2.4 

(i) Write down the distribution of $X_n$ if the customer has $i$ different cards after
$n - 1$ purchases.

(ii)  Is  the  situation  a  Bernoulli  process? 

(iii)  Identify  two  other  sequences  of  random  variables  associated  with  the  process, 
and  state  for  each  of  them  whether  the  time  variable  and  the  random 
variable  are  discrete  or  continuous.  In  each  case,  give  the  state  space.  □ 

Example  2.12  The  weather 

Suppose  that,  at  a  particular  location,  the  weather  is  classified  each  day  as  either 
wet  or  dry  according  to  some  specific  criterion,  perhaps  wet  if  at  least  1  mm  of 
rain  is  recorded,  otherwise  dry.  It  is  well  known  that  weather  tends  to  go  in  spells 
of  wet  or  dry  and  a  possible  model  is  that  the  weather  on  any  one  day  depends 
only  on  the  weather  the  previous  day.  For  example,  if  it  rains  today,  then  the 
probability  that  it  will  rain  tomorrow  is  |  and  the  probability  that  it  will  be  dry 
is  §;  on  the  other  hand,  if  it  is  dry  today,  then  the  probability  that  it  will  be  dry 
(wet)  tomorrow  is  ^  (^).  We could define the sequence $\{X_n; n = 0, 1, 2, \ldots\}$
where  Xn  =  0  if  it  is  wet  on  day  n  and  Xn  =  1  if  it  is  dry  on  day  n.  Here,  both 
time  and  Xn  are  discrete. 

The  sort  of  questions  that  arise  include  the  following.  If  it  is  wet  on  Monday, 
what  is  the  probability  that  it  will  be  wet  the  following  Thursday?  What  is  the 
long-term  proportion  of  wet  days?  If  it  is  wet  today,  how  long  will  it  be  before  the 
next  wet  day?  □ 
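
The first of these questions can be answered by multiplying one-step probabilities together. The sketch below shows the mechanics only: the transition probabilities in `P` are hypothetical values chosen for illustration and are not those of the example, and the helper names `mat_mul` and `n_step` are likewise illustrative.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def n_step(p, n):
    """Return the n-step transition matrix, p multiplied by itself n times (n >= 1)."""
    result = p
    for _ in range(n - 1):
        result = mat_mul(result, p)
    return result

# Hypothetical one-step probabilities (NOT the values from the example).
# State 0 = wet, state 1 = dry, matching the coding of X_n in the text.
P = [[0.6, 0.4],    # today wet: P(wet tomorrow), P(dry tomorrow)
     [0.3, 0.7]]    # today dry: P(wet tomorrow), P(dry tomorrow)

P3 = n_step(P, 3)   # Monday to Thursday is three steps
print(f"P(wet Thursday | wet Monday) = {P3[0][0]:.3f}")
```

The entry in row 0, column 0 of the three-step matrix sums the probabilities of all wet/dry paths from a wet Monday to a wet Thursday.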


This  model  is  developed  in  Unit  8. 


This  is  an  example  of  a  Markov 
chain:  Unit  6  discusses  such 
models. 




Question  2.5 

(i)  Although  the  model  above  can  be  thought  of  as  a  sequence  of  trials,  it  is  not 
a  Bernoulli  process.  Why? 

(ii) What is the state space of the process $\{X_n; n = 0, 1, 2, \ldots\}$? □

Example  2.13  Family  surnames 

In  a  community,  the  surname  is  passed  down  from  generation  to  generation 
through  male  offspring  only.  Suppose  that  each  man  has  a  number  of  sons,  which 
is  a  random  variable,  taking  the  values  0, 1,  2, ... .  Each  man  reproduces 
independently  of  all  others. 

One  ancestor  (patriarch)  has  a  number  of  sons  who  form  the  first  generation;  each 
of  these  has  sons  who  form  the  second  generation;  and  so  on.  We  can  define  the 
random variable $X_n$ to be the number of men in the $n$th generation, with $X_0 = 1$
denoting the original ancestor. Then $\{X_n; n = 0, 1, 2, \ldots\}$ is an example of a
branching process. Here the time variable is discrete and represents the generation
number; the state space is also discrete and is equal to $\{0, 1, 2, \ldots\}$.

The  questions  that  are  of  interest  here  include  the  distribution  of  the  size  of  the 
nth  generation.  In  particular,  will  the  family  surname  survive,  or  will  it 
eventually  become  extinct?  □ 
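
A realization of the generation sizes can be simulated once a distribution for the number of sons is chosen. The example does not specify one, so the probabilities used below are purely hypothetical, as are the function name `simulate_generations` and the seed.

```python
import random

def simulate_generations(offspring_probs, n_generations, seed=None):
    """Return [X_0, X_1, ..., X_n], the number of men in each generation."""
    rng = random.Random(seed)
    values = list(range(len(offspring_probs)))   # possible numbers of sons: 0, 1, 2, ...
    sizes = [1]                                  # X_0 = 1: the original ancestor
    for _ in range(n_generations):
        current = sizes[-1]
        # Each man has sons independently of all others; their total is the next generation.
        sons = rng.choices(values, weights=offspring_probs, k=current)
        sizes.append(sum(sons))
    return sizes

# Hypothetical offspring distribution: P(0 sons), P(1 son), P(2 sons).
sizes = simulate_generations([0.3, 0.4, 0.3], n_generations=10, seed=5)
print(sizes)
```

Once a generation has size 0 every later generation is also 0, which corresponds to the surname becoming extinct.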

Example  2.14  A  bank  statement 

The  amount  of  money  in  a  current  account  as  recorded  at  the  end  of  the  monthly 
statement can be represented by a sequence $\{X_n; n = 0, 1, \ldots\}$. Here $n$ relates to
the number of the statement and so is discrete; $X_n$ is the amount of money and
can be conveniently considered as continuous, though it is actually an integral
number of pence. The initial value, $X_0$, is the amount deposited when the account
was opened. The difference $X_{n+1} - X_n$ may be made up of a monthly salary
cheque, and perhaps some interest or gifts, less the various amounts paid out. The
process $\{X_n; n = 0, 1, 2, \ldots\}$ is an example of the class of processes with discrete
time  domain  and  continuous  state  space.  □ 

Example  2.15  The  bee  orchid 

Pingle  Wood  Cutting  is  a  nature  reserve  on  the  route  of  the  now  disused  Great 
Eastern  Railway  line  between  March  and  St  Ives.  It  is  owned  by  the  Bedfordshire 
and  Huntingdonshire  Wildlife  Trust,  and  bee  orchids  can  be  found  there  in  June. 
In  any  year,  the  distribution  of  the  orchid  in  the  reserve  can  be  thought  of  as  a 
random  process  in  two-dimensional  space.  Each  point  can  be  identified  as  (x,y) 
according  to  a  map  reference,  and  a  random  variable  X  is  defined  by  X(x,y)  =  1 
if  an  orchid  grows  at  that  point  and  X(x,y)  =  0  otherwise.  Then 
$\{X(x, y); x \in \mathbb{R}, y \in \mathbb{R}\}$ is a random process where the equivalent of the ‘time’
variable  actually  refers  to  space  and  is  two-dimensional  and  continuous.  The  state 
space  is  {0, 1}  and  so  is  discrete. 

Naturalists  might  be  interested  in  questions  such  as  the  following.  Does  the  orchid 
grow  randomly  in  the  reserve  or  does  it  favour  certain  soil  types  or  south-sloping 
land?  Does  it  grow  singly  or  in  clumps?  To  answer  such  questions,  a  model  would 
have to be set up, and the data collected and compared with the model. □


Branching processes are discussed
in Unit 4.


Models  for  random  processes  in 
space  are  discussed  in  Unit  3. 




Example  2.16  The  price  of  wheat 


In  an  article  published  in  1953,  Professor  Sir  Maurice  Kendall  considers  wheat 
prices  in  Chicago,  measured  in  cents  per  bushel  at  weekly  intervals  from  January 
1883  to  September  1934  (with  a  gap  during  the  war  years).  A  portion  of  these 
data  is  shown  in  Figure  2.6. 


Figure  2.6  The  price  of  wheat  in  Chicago 


Source:  M.  G.  Kendall,  ‘The 
analysis  of  economic  time-series, 
Part  I:  Prices’.  Journal  of  the 
Royal  Statistical  Society ,  vol.  96, 
part  1  (1953)  pp.  11-25. 


There  is  an  overall  fall  in  the  price  over  the  two-year  period  shown  in  the  graph. 
However,  as  Kendall  reported,  first  impressions  are  ‘almost  as  if  once  a  week  the 
Demon  of  Chance  drew  a  random  number  . . .  and  added  it  to  the  current  price  to 
determine the next week’s price’. In other words, the price of wheat, $Q(t)$, is a
random process. Although observed only once a week, the price could change at
any time, and the price itself varies continuously (though rounded to the nearest
cent). The process $\{Q(t);\ t \ge 0\}$ is therefore an example of a random
process where both the time domain and the state space are continuous; it is an
example of a diffusion process. □
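Kendall's ‘Demon of Chance’ description can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weekly increments are taken to be independent normal variates with mean zero, and the starting price, the standard deviation and the function name `simulate_prices` are hypothetical choices, not values taken from Kendall's data.

```python
import random

def simulate_prices(start, n_weeks, sd, seed=None):
    """Return n_weeks + 1 weekly prices: each week a random number is
    added to the current price to give the next week's price."""
    rng = random.Random(seed)
    prices = [start]
    for _ in range(n_weeks):
        # Assumption: increments are independent N(0, sd^2) variates.
        prices.append(prices[-1] + rng.gauss(0.0, sd))
    return prices

# Hypothetical values: start at 100 cents per bushel, two years of weekly steps.
path = simulate_prices(start=100.0, n_weeks=104, sd=2.0, seed=1)
print(len(path), path[0])
```

The sketch produces the price only at the weekly observation times; between observations the process described in the text may still change.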

Example  2.17  The  spread  of  a  disease 


Diffusion processes such as that of Example 2.16
are studied in Unit 14.


Suppose  that  an  infectious  disease  is  introduced  into  a  community  and  spreads 
through  it.  At  any  point  of  time,  each  member  of  the  community  may  be 
classified  as  belonging  to  just  one  of  four  categories:  healthy  but  susceptible  to 
the  disease;  having  the  disease  and  infectious;  recovered  and  immune  from  a 
further attack; dead. These categories can be called $S_1$, $S_2$, $S_3$, $S_4$ respectively.

A person may pass from $S_1$ to $S_2$ after contact with someone in $S_2$. Anyone in $S_2$
will eventually go to either $S_3$ or $S_4$. Four sequences of random variables,
$\{S_1(t);\ t \ge 0\}$, $\{S_2(t);\ t \ge 0\}$, $\{S_3(t);\ t \ge 0\}$, $\{S_4(t);\ t \ge 0\}$, can be defined where
$S_i(t)$ is the number of people in category $S_i$ at time $t$. Each of these processes is
defined for continuous time and the state space of each is discrete. If the disease
starts in a community of size $N$ with a single infectious person, then
$S_1(0) = N - 1$, $S_2(0) = 1$, $S_3(0) = 0$, $S_4(0) = 0$. If, at some time $t$, $S_2(t) = 0$,
then the disease will spread no further. There is a relationship between the four
variates: if no one enters or leaves the community, then
$S_1(t) + S_2(t) + S_3(t) + S_4(t) = N$, the total size of the community.




To  develop  a  model  for  this  process,  it  is  necessary  to  specify  the  mechanics  of  the 
spread  of  the  disease,  the  probabilities  that  an  infected  person  will  recover  or  die, 
the  time  spent  in  various  stages,  etc.  □ 

Question  2.6  A  population  model 

A  colony  of  bacteria  develops  by  the  division  (into  two)  of  bacteria  and  by  the 
death  of  bacteria.  No  bacteria  join  or  leave  the  colony. 

(i)  Identify  two  random  processes  to  describe  the  development  of  this  colony, 
and  in  each  case  specify  whether  the  state  space  and  the  time  domain  are 
discrete  or  continuous.  Write  down  the  state  space  for  each  process. 

(ii)  Suppose  the  colony  starts  with  two  bacteria.  Sketch  a  possible  realization  of 
the  size  of  the  colony  over  time.  No  calculations  or  simulations  are  required, 
only  an  indication.  □ 


3  The  Poisson  process 

As  you  will  see  in  this  section,  the  study  of  events  in  time  is  of  interest  in  many 
different  areas;  much  of  this  course  will  be  spent  deriving  models  for  such 
situations  and  developing  the  properties  of  these  models.  Let  us  now  look  at  some 
data  and  try  to  isolate  any  features  or  patterns  that  are  apparent. 

We  shall  look  at  three  different  data  sets:  each  describes  the  occurrence  of  a 
recurrent  event  over  some  period  of  observation.  We  shall  then  begin  our 
construction  of  a  probability  model  which  (it  is  hoped)  will  adequately  represent 
the  inherent  variation  in  certain  types  of  random  process. 

Example  3.1  Earthquakes 

Table  3.1  contains  a  list  of  the  most  serious  earthquakes  that  have  occurred  this 
century  up  to  1977,  and  includes  the  date  and  place  of  occurrence,  the  magnitude 
of  the  earthquake  and  the  estimated  number  of  fatalities.  The  magnitude  of  an 
earthquake  is  measured  on  the  Richter  scale  and  relates  to  the  energy  released. 


Example 2.17 describes an epidemic
process; several different
models for such processes are derived and
investigated in Unit 10.


The colony in Question 2.6 is an example of what is
known as a birth and death process;
such processes are analysed in
Unit 8.




Table 3.1 Major earthquakes in the 20th century

Date           Magnitude   Region                          Estimated number
               (if known)                                  of fatalities
1902 Dec  16   -           Turkestan                       4 500
1905 Apr   4   8.6         India: Kangra                   19 000
1905 Sept  8   -           Italy: Calabria                 2 500
1906 Jan  31   8.9         Colombia                        1 000
1906 Mar  16   -           Formosa: Kagi                   1 300
1906 Apr  18   8.3         California: San Francisco       700
1906 Aug  17   8.6         Chile: Santiago, Valparaiso     20 000
1907 Jan  14   -           Jamaica: Kingston               1 600
1907 Oct  21   8.1         Central Asia                    12 000
1908 Dec  28   7.5         Italy: Messina, Reggio          83 000
1911 Jan   3   8.7         China: Tien-Shan                450
1912 Aug   9   7.8         Marmara Sea                     1 950
1915 Jan  13   7.0         Italy: Avezzano                 29 980
1915 Oct   3   7.6         California, Nevada              0
1920 Dec  16   8.6         China: Kansu, Shansi            100 000
1922 Nov  11   8.4         Peru: Atacama                   600
1923 Sept  1   8.3         Japan: Tokyo, Yokohama          143 000
1925 Mar  16   7.1         China: Yunnan                   5 000
1927 Mar   7   7.9         Japan: Tango                    3 020
1927 May  22   8.3         China: Nan-Shan                 200 000
1929 May   1   7.1         Iran: Shirwan                   3 300
1929 June 16   7.8         New Zealand: Buller             17
1930 July 23   6.5         Italy: Ariano, Melfi            1 430
1931 Feb   2   7.9         New Zealand: Hawke’s Bay        255
1933 Mar   2   8.9         Japan: Morioka                  2 990
1934 Jan  15   8.4         India: Bihar-Nepal              10 700
1935 Apr  20   7.1         Formosa                         3 280
1935 May  30   7.5         Pakistan: Quetta                30 000
1939 Jan  25   8.3         Chile: Talca                    28 000
1939 Dec  26   7.9         Turkey: Erzincan                30 000
1943 Sept 10   7.4         Japan: Tottori                  1 190
1944 Dec   7   8.3         Japan: Tonankai, Nankaido       1 000
1945 Jan  12   7.1         Japan: Mikawa                   1 900
1946 Nov  10   7.4         Peru: Ancash                    1 400
1946 Dec  20   8.4         Japan: Tonankai, Nankaido       1 330
1948 June 28   7.3         Japan: Fukui                    5 390
1948 Oct   5   7.6         Turkmenia, Ashkhabad            Unknown
1949 Aug   5   6.8         Ecuador: Ambato                 6 000
1950 Aug  15   8.7         India, Assam, Tibet             1 530
1952 Mar   4   8.6         Japan: Tokachi                  28
1952 July 21   7.7         California: Kern County         11
1954 Sept  9   6.8         Algeria: Orleansville           1 250
1955 Mar  31   7.9         Philippines: Mindanao           430
1956 June  9   7.7         Afghanistan: Kabul              220
1956 July  9   7.7         Aegean Sea: Santorini           57
1957 July 28   7.8         Mexico: Acapulco                55
1957 Dec   4   8.0         Mongolia: Altai-Gobi            30
1957 Dec  13   7.1         Iran: Farsinaj, Hamadan         1 130
1958 July 10   7.8         Alaska, Brit. Columbia, Yukon   5
1960 Feb  29   5.8         Morocco: Agadir                 14 000
1960 May  22   8.5         Chile: Valdivia                 5 700
1962 Sept  1   7.3         Iran: Qazvin                    12 230
1963 July 26   6.0         Yugoslavia: Skopje              1 200
1964 Mar  28   8.5         Alaska: Anchorage, Seward       178
1968 Aug  31   7.4         Iran: Dasht-e Bayaz             11 600
1970 May  31   7.8         Peru: Nr Lima                   66 000
1972 Dec  23   6.2         Nicaragua: Managua              5 000
1974 Dec  28   6.2         Pakistan: Pattan                5 300
1975 Feb   4   7.5         China: Haicheng, Liaoning       Few
1976 Feb   4   7.9         Guatemala                       22 000
1976 May   6   6.5         Italy: Gemona, Friuli           1 000
1976 July 27   7.6         China: Tangshan                 650 000
1977 Mar   4   7.2         Romania: Vrancea                2 000

An  earthquake  is  included  if  its 
magnitude  was  at  least  7.5  or  if  a 
thousand  or  more  people  were 
killed. 


Source: S237 The Earth: structure,
composition and evolution, Block 2.




Seismologists  would  study  these  data  with  specific  objectives  in  mind:  they  might 
wish  to  study  the  structure  of  the  Earth  or  to  predict  future  earthquakes,  for 
example.  In  this  unit  we  shall  be  concerned  with  the  times  of  occurrence. 
Accordingly,  the  first  step  is  to  calculate  the  times  (in  days)  between  successive 
earthquakes:  these  are  shown  in  Table  3.2.  (The  numbers  should  be  read  down 
the  columns.) 

Table 3.2 Times between major earthquakes (in days)

 840   280   695   402   335    99   436    83   735
 157   434   294   194  1354   304    30   832    38
 145   736   562   759   454   375   384   328   365
  44   584   721   319    36   567   129   246    92
  33   887    76   460   667   139     9  1617    82
 121   263   710    40    40   780   209   638   220
 150  1901    46  1336   556   203   599   937

You  can  see  that  these  times,  ranging  from  1901  days  (over  5  years)  down  to 
9  days,  have  a  very  large  variance.  But  it  is  difficult  to  appreciate  a  pattern  from 
studying  a  list  of  figures.  In  order  to  develop  some  intuition  about  the  pattern, 
the  data  are  presented  in  two  ways  in  Figures  3.1  and  3.2.  In  the  first,  ‘blobs’  on  a 
time  axis  show  the  incidence  of  earthquakes  over  the  period  of 
observation.  The  second  representation  gives  a  cumulative  count  with  passing  time. 
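The entries of Table 3.2 can be checked against the dates in Table 3.1. The sketch below uses Python's standard `datetime` and `statistics` modules to recompute the first three intervals and to summarize the 62 tabulated times; the variable names are illustrative only, and the order in which the intervals are listed does not affect the summary.

```python
from datetime import date
from statistics import mean

# Dates of the first four earthquakes in Table 3.1.
first_quakes = [date(1902, 12, 16), date(1905, 4, 4),
                date(1905, 9, 8), date(1906, 1, 31)]
first_gaps = [(later - earlier).days
              for earlier, later in zip(first_quakes, first_quakes[1:])]
print(first_gaps)  # [840, 157, 145]

# The 62 times between successive earthquakes from Table 3.2 (in days).
intervals = [
    840, 280, 695, 402, 335, 99, 436, 83, 735,
    157, 434, 294, 194, 1354, 304, 30, 832, 38,
    145, 736, 562, 759, 454, 375, 384, 328, 365,
    44, 584, 721, 319, 36, 567, 129, 246, 92,
    33, 887, 76, 460, 667, 139, 9, 1617, 82,
    121, 263, 710, 40, 40, 780, 209, 638, 220,
    150, 1901, 46, 1336, 556, 203, 599, 937,
]

# The intervals should add up to the whole period of observation.
total_days = (date(1977, 3, 4) - date(1902, 12, 16)).days
print(len(intervals), sum(intervals) == total_days)  # 62 True
print(min(intervals), max(intervals), round(mean(intervals), 1))  # 9 1901 437.2
```

The mean interval, about 437 days, is a little over fourteen months.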

Figure 3.1 Times at which major earthquakes occurred, measured in days (axis labels: 1/1/1900, 16/12/1902, 4/3/1977)


Figure  3.2  Cumulative  number  of  earthquakes  against  time 

The  trend  in  Figure  3.2  could  be  approximated  very  roughly  by  a  straight  line; 
there  is  no  pronounced  curvature.  This  implies  that  the  rate  of  occurrence  of 
earthquakes  has  remained  more  or  less  steady.  However,  both  figures  make 
evident  the  random  element  in  the  sequence  of  earthquake  times,  and  it  is  this 
element  that  we  wish  to  model.  □ 




Example  3.2  Mining  accidents 

Data  are  available  on  the  dates  of  accidents  in  coal  mines  that  were  due  to 
explosions  and  in  which  there  were  ten  or  more  fatalities.  Figure  3.3  shows  the 
cumulative  number  of  such  explosions  in  coal  mines  in  Great  Britain  for  the 
period  15  March  1851  to  22  March  1962. 


Figure  3.3  Cumulative  number  of  explosions  in  coal  mines  in  Great  Britain,  in  which 
ten  or  more  miners  were  killed,  against  time 


The  pattern  in  Figure  3.3  is  different  from  that  of  the  earthquakes  (in  Figure  3.2). 
This  graph  is  roughly  linear  for  the  first  14  000  days  (up  to  about  1890),  but  the 
gradient  then  becomes  less  steep.  This  means  that  the  rate  of  occurrence  of 
accidents  decreased;  a  possible  explanation  for  this  is  that  safety  precautions  in 
coal  mines  were  improved  in  about  1890.  This  is  an  example  of  a  process  where 
the  rate  of  occurrence  of  events  is  not  constant,  but  changes  with  time.  □ 

Example  3.3  Road  accidents 

Another  example  of  an  event  in  time  is  a  fatality  in  a  road  accident.  If  we  are 
considering  such  fatalities  over  a  period  of  several  years,  the  number  of  events  is 
so  large  that  neither  of  the  diagrammatic  methods  so  far  developed  for  displaying 
the  data  is  practicable.  Instead,  we  can  use  a  third  method,  as  exemplified  by 
Figure  3.4. 


Figure  3.4  Fatalities  in  road  accidents  in  Great  Britain  from  1965  to  1985 


Source (Figure 3.3): R. G. Jarrett, ‘A note on
intervals between coal-mining
disasters’, Biometrika,
vol. 66 (1979) pp. 191-3.


Source (Figure 3.4): Annual Abstract of
Statistics (HMSO, 1985).




Figure  3.4  shows  that  the  number  of  deaths  from  road  accidents  since  1972  has 
tended  to  decline.  As  with  coal-mining  disasters,  the  rate  of  occurrence  of  events 
appears  to  alter  with  passing  time.  □ 

In  this  section,  we  discuss  a  model  for  the  occurrence  of  events  where  the  average 
rate  of  occurrence  remains  constant  over  time.  This  model  is  the  Poisson  process. 
The  model  might  be  a  suitable  one  for  the  occurrence  of  major  earthquakes,  for 
instance  (Example  3.1).  However,  it  is  not  suitable  for  modelling  either  the 
occurrence  of  coal-mining  disasters  in  Great  Britain  or  the  occurrence  of  fatalities 
in  road  accidents  (Examples  3.2  and  3.3)  since,  for  each  of  these  processes,  the 
rate  of  occurrence  of  events  changes  over  time.  A  model  for  situations  such  as 
these  is  described  in  Subsection  4.1. 


3.1  Basic  ideas  and  results 

The  Poisson  process  is  the  continuous-time  analogue  of  the  Bernoulli  process.  In 
this  process  events  occur  ‘at  random’,  but  instead  of  occurring  as  a  result  of 
regular  trials  they  can  occur  at  any  time.  The  Poisson  process  provides  a  good 
model  for  such  varied  situations  as  the  decay  of  radioactive  atoms  in  a  lump  of 
material,  the  times  of  arrival  of  customers  to  join  a  queue,  the  instants  at  which 
cars  pass  a  point  on  a  road,  and  the  times  of  goals  scored  in  a  soccer  match. 

The  discrete-time  Bernoulli  process  is  a  useful  model  when  an  event  either  occurs 
(a  success)  or  does  not  occur  (a  failure)  at  each  of  a  clearly  defined  sequence  of 
opportunities (trials). The process is characterized by the assumption that the
probability  of  success  remains  the  same  from  trial  to  trial,  and  the  outcome  at 
any  trial  is  independent  of  the  outcomes  of  previous  trials.  In  this  sense,  the 
sequence  of  success  and  failure  is  quite  haphazard. 

Events  may  also  occur  in  a  random,  haphazard  kind  of  a  way  in  continuous  time, 
when  there  is  no  notion  of  a  ‘trial’  or  ‘opportunity’.  Some  examples  of 
unforecastable  random  events  occurring  in  continuous  time  (with  an  estimate  of 
the  rate  at  which  they  might  happen)  are: 

(a)  machine  breakdowns  on  the  factory  floor  (one  every  five  days); 

(b)  light  bulb  failures  in  the  home  (one  every  three  weeks); 

(c)  arrivals  at  a  hospital  casualty  ward  (one  every  ten  minutes  at  peak  time); 

(d) major earthquakes world-wide (one every fourteen months);

(e) power cuts in the home (‘frequently’ in winter, ‘seldom’ in summer).

Typically,  a  realization  of  this  sort  of  random  process  is  represented  as  a  sequence 
of  blobs  drawn  on  a  time  axis,  as  in  Figure  3.5  (and  as  previously  shown  in 
Figure  3.1),  the  blobs  giving  the  times  of  occurrence. 


Figure 3.5 Schematic representation of a random sequence of events in time


The  Poisson  process 

A  common  model  for  the  occurrence  of  random  events  in  continuous  time  is 
the  Poisson  process,  in  which  the  following  assumptions  are  made. 

(i)  Events  occur  singly. 

(ii)  The  average  rate  of  occurrence  of  events  remains  constant. 

(iii)  The  incidence  of  future  events  is  independent  of  the  past. 


So, for instance, the occurrence of light bulb failures in the home might be well
(or at least adequately) modelled by a Poisson process: these never (or rarely)
happen simultaneously and there is no particular reason why the rate at which
they occur should vary with passing time. Perhaps it is just arguable that the
incidence of past events provides indicators for the future. (But remember: few
models are ‘right’, most are adequate, at best; and it may be that a Poisson
process model for failures is adequate for the purpose of determining, say, the
stock of light bulbs to keep in the home.)

On  the  other  hand,  the  incidence  of  power  cuts  in  the  home  would  not  be  well 
modelled  by  a  Poisson  process:  the  rate  is  greater  in  winter  than  in  summer,  so 
the  second  assumption  is  not  reasonable  in  this  case. 

There  are  two  random  variables  of  interest  for  the  Poisson  process:  the  number  of 
events  to  occur  over  any  particular  period  of  observation  (for  example, 
breakdowns  in  a  month),  which  is  a  discrete  random  variable;  and  the  time 
between  consecutive  events  (or  ‘waiting  time’  between  consecutive  events,  as  it  is 
often  called),  which  is  a  continuous  random  variable.  The  distributions  of  these 
random  variables  are  stated  in  the  box  below. 


The  Poisson  process:  two  main  results 

For events occurring at random at an average rate $\lambda$ in such a way that their
occurrence may be modelled as a Poisson process:

(i) the number of events to occur during a time interval of duration $t$ is a
discrete random variable $X$, where $X \sim \text{Poisson}(\lambda t)$;

(ii) the waiting time $T$ between consecutive events follows an exponential
distribution: $T \sim M(\lambda)$.


M246,  Chapter  12. 


These  two  results  are  derived  in  Subsection  3.2.  Their  use  is  illustrated  in  the 
next  two  examples. 


Example  3.4 

Over  moderately  short  intervals  of  time,  the  incidence  of  arrivals  at  a  casualty 
ward  may  usefully  be  modelled  as  a  Poisson  process  in  time  with  (on  average)  ten 
minutes  between  arrivals.  Find  the  probability  that  over  the  course  of  half  an 
hour  there  are  no  arrivals. 


Solution 


The rate of the Poisson process is $\lambda = \frac{1}{10}$ per minute, so $X$, the number of arrivals
in half an hour, has a Poisson distribution with parameter

$$\lambda t = \tfrac{1}{10} \text{ per minute} \times 30 \text{ minutes} = 3.$$

The probability that there are no arrivals in half an hour is

$$P(X = 0) = e^{-3} \approx 0.0498.$$
□
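The calculation in Example 3.4 can be reproduced numerically. The following is a minimal sketch using only Python's standard `math` module; the helper name `poisson_pmf` is illustrative and not part of the course material.

```python
import math

def poisson_pmf(x, mu):
    """P(X = x) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**x / math.factorial(x)

rate = 1 / 10          # arrivals per minute
duration = 30          # minutes
mu = rate * duration   # Poisson parameter, lambda * t = 3

p_none = poisson_pmf(0, mu)
print(round(p_none, 4))  # 0.0498
```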

Question 3.1 In a particular psychological experiment, nerve impulses were
found to occur at an average rate of 458 impulses per second. Assuming a Poisson
process model for the incidence of impulses, find the probability that over any
interval of $\frac{1}{100}$ second not more than one nerve impulse occurs. □


If $X \sim \text{Poisson}(3)$, then
$P(X = x) = \dfrac{e^{-3}\,3^x}{x!}$
for $x = 0, 1, 2, \ldots$.


Example  3.5 

Assuming that the incidence of earthquakes world-wide may be adequately
modelled  as  a  Poisson  process,  and  that  earthquakes  occur  at  a  rate  of  one  every 
fourteen  months  on  average,  find  (i)  the  probability  that  over  a  period  of  ten 
years  there  will  be  at  least  three  earthquakes  and  (ii)  the  probability  that  the 
waiting  time  between  two  consecutive  earthquakes  exceeds  two  years. 




Solution 

The rate of the Poisson process in this case is
$\lambda = \frac{1}{14}$ per month.

(i) So the expected number of earthquakes in a 10-year period is

$$\lambda t = \tfrac{1}{14} \text{ per month} \times 120 \text{ months} \approx 8.571,$$

and the probability that there will be at least three earthquakes is
$P(X \ge 3)$, where $X$ is Poisson(8.571). This is given by

$$P(X \ge 3) = 1 - P(X \le 2) = 1 - e^{-8.571}\left(1 + 8.571 + \frac{8.571^2}{2!}\right) \approx 0.9912.$$

If $X \sim \text{Poisson}(8.571)$, then
$P(X = x) = \dfrac{e^{-8.571}\,(8.571)^x}{x!}$
for $x = 0, 1, 2, \ldots$.

(ii) The probability that the waiting time between two consecutive earthquakes
exceeds two years (that is, 24 months) is $P(T > 24)$, where $T$ is
exponentially distributed with parameter $\lambda = 1/14$. The c.d.f. of $T$ is

$$F(t) = P(T \le t) = 1 - e^{-\lambda t}, \quad t \ge 0.$$

Therefore

$$P(T > t) = e^{-\lambda t}$$

and hence

$$P(T > 24) = e^{-24/14} \approx 0.1801.$$
□
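Both parts of Example 3.5 can be checked with a few lines of Python. This is a sketch using only the standard `math` module; the helper name `poisson_cdf` is illustrative. It works with the unrounded parameter 120/14 rather than 8.571, which makes no difference to four decimal places.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term."""
    return sum(math.exp(-mu) * mu**x / math.factorial(x) for x in range(k + 1))

rate = 1 / 14                                  # earthquakes per month
mu = rate * 120                                # expected number in ten years
p_at_least_three = 1 - poisson_cdf(2, mu)      # P(X >= 3) = 1 - P(X <= 2)
p_wait_over_two_years = math.exp(-rate * 24)   # P(T > 24) = exp(-lambda * 24)

print(round(mu, 3))                     # 8.571
print(round(p_at_least_three, 4))       # 0.9912
print(round(p_wait_over_two_years, 4))  # 0.1801
```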

Question  3.2  Data  collected  on  the  major  volcanic  eruptions  in  the  northern 
hemisphere  give  a  mean  time  between  eruptions  of  29  months.  Assume  these 
incidents  occur  as  a  Poisson  process  in  time. 

(i)  Find  the  expected  number  of  eruptions  during  the  five-year  period 
January  2002-December  2006. 

(ii) Find the probability that there are exactly two eruptions during this period.

(iii) Find the probability that at least three years pass after you have read this
question before the next eruption. □

Notation 

For events in a Poisson process (or, for that matter, any continuous-time process
where successive events are counted) the number of events occurring between
times $t_1$ and $t_2$ is denoted $X(t_1, t_2)$. In the realization depicted in Figure 3.6, for
instance, the observed value of $X(2,4)$ is 4. The observed value of $X(0,4)$ is 7. By
convention, the random variable $X(0,t)$ is usually written simply as $X(t)$.


Figure 3.6 Events in a Poisson process (time axis $t$ marked from 0 to 5)

In a Poisson process with events occurring at rate $\lambda$, the random variable $X(t)$
has a Poisson distribution with parameter $\lambda t$, $X(t) \sim \text{Poisson}(\lambda t)$, and the
random variable $X(t_1, t_2)$ has a Poisson distribution with parameter $\lambda(t_2 - t_1)$,
$X(t_1, t_2) \sim \text{Poisson}(\lambda(t_2 - t_1))$.
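The counting notation can be made concrete with a short sketch. The event times below are hypothetical: the actual times in Figure 3.6 are not given, so they have been chosen only to agree with the two observed counts quoted above. The function name `count_events` and the convention of counting events in the half-open interval from $t_1$ (excluded) to $t_2$ (included) are also assumptions of this sketch.

```python
def count_events(event_times, t1, t2):
    """Observed value of X(t1, t2): the number of events with t1 < time <= t2."""
    return sum(1 for s in event_times if t1 < s <= t2)

# Hypothetical event times, consistent with the counts quoted for Figure 3.6.
event_times = [0.4, 1.1, 1.7, 2.3, 2.9, 3.2, 3.8, 4.6]

print(count_events(event_times, 2, 4))  # 4
print(count_events(event_times, 0, 4))  # 7, the observed value of X(4)
```

Counts over adjoining intervals add up: here the observed value of X(0,4) is the sum of the observed values of X(0,2) and X(2,4).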

For  a  given  realization  of  a  Poisson  process,  X(t)  may  be  graphed  against  t  in  the 
following  way. 




Figure 3.7 Graph of a realization of the Poisson process $\{X(t);\ t \ge 0\}$ (vertical axis $X(t)$, horizontal axis $t$)

The waiting time from the start of observation to the first event is conventionally
written $T_1$; the waiting time between the first event and the second is written $T_2$;
in general, the waiting time between the $(n-1)$th event and the $n$th is written
$T_n$. This is illustrated in Figure 3.8.


Figure 3.8 Waiting times between events

For a Poisson process, the random variables $T_1, T_2, \ldots, T_n$ are independent and
identically distributed exponential random variables with parameter $\lambda$. The
waiting time

$$W_n = T_1 + T_2 + \cdots + T_n$$

from the start of observation to the time of the $n$th event is the sum of $n$
independent exponential variates with parameter $\lambda$.

Question 3.3 Write down the distribution of the random variable $W_n$. □

Simulation

A simulated realization (or, simply, ‘simulation’) of events in a Poisson process is
most easily generated by simulating the sequence of waiting times between events:
these are independent observations on the random variable $T \sim M(\lambda)$, where $\lambda$ is
the rate of occurrence.

One  method  of  simulating  random  numbers  from  an  exponential  distribution  is  to 
use  the  Probability-integral  Transformation:  an  observation  t  on  a  continuous 
random  variable  T  with  cumulative  distribution  function  F(t)  may  be  simulated 
by  solving  for  t  the  equation 

F(t)  =  u, 

where $u$ is a random observation on the uniform distribution $U(0, 1)$. When $T$ is
$M(\lambda)$, this reduces to solving the equation

$$1 - e^{-\lambda t} = u$$

for $t$, which gives

$$t = -\frac{1}{\lambda} \log(1 - u).$$

For a sequence of waiting times $t_1, t_2, \ldots$, a sequence of uniform random numbers
$u_1, u_2, \ldots$ is required. This may be obtained from the table on page 42 of Neave.
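The Probability-integral Transformation for the exponential distribution can be sketched as follows. As an assumption of this sketch, Python's `random.random()` stands in for the table of uniform random numbers, and the rate, the seed and the function name `exponential_variate` are hypothetical choices.

```python
import math
import random

def exponential_variate(rate, u):
    """Solve 1 - exp(-rate * t) = u for t, where 0 <= u < 1."""
    return -math.log(1.0 - u) / rate

rng = random.Random(2)   # arbitrary seed, for reproducibility only
rate = 0.1               # hypothetical rate: one event per 10 time units
uniforms = [rng.random() for _ in range(5)]
waiting_times = [exponential_variate(rate, u) for u in uniforms]
print(waiting_times)
```

Substituting each simulated time back into the c.d.f. recovers the uniform number it came from, which is a convenient check on the algebra.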


See Unit 1, Subsection 6.1.


Remember  that  in  this  course  ‘log’ 
is  used  to  denote  a  logarithm  with 
base  e. 




In fact, your copy of Neave can save you most of the work involved in a
simulation of a Poisson process. As well as the list of uniform random numbers on
page 42, as you saw in Unit 1 there is on page 43 a list of exponential random
numbers, drawn from the exponential distribution with mean 1. To obtain a
sequence of waiting times $t_1, t_2, \ldots$ from the exponential distribution $M(\lambda)$ (that
is, with mean $1/\lambda$), simply multiply each term in a sequence
$e_1, e_2, \ldots$ drawn from page 43 by the mean $1/\lambda$:

$$t_j = \frac{1}{\lambda}\, e_j.$$

Example  3.6 

It  is  required  to  simulate  the  sequence  of  events  occurring  in  a  Poisson  process 
from  the  start  of  observation  for  one  hour.  Suppose  that  events  are  occurring  at 
an  average  rate  of  one  every  10  minutes. 

Using the first row from the table of exponential random numbers on page 43 of
Neave, the simulation may be set out as in the table below. The rate $\lambda$ is $\frac{1}{10}$ per
minute; so the mean time between events is $1/\lambda = 10$ minutes.


n    e_n       t_n = 10 e_n    w_n = t_1 + ... + t_n
1    0.6193     6.193           6.193 ≈  6.19
2    1.8350    18.350          24.543 ≈ 24.54
3    0.2285     2.285          26.828 ≈ 26.83
4    1.5106    15.106          41.934 ≈ 41.93
5    0.5024     5.024          46.958 ≈ 46.96
6    2.3326    23.326          70.284 ≈ 70.28

The times in minutes at which the first five events occur are

$$w_1 = 6.19, \quad w_2 = 24.54, \quad w_3 = 26.83, \quad w_4 = 41.93, \quad w_5 = 46.96.$$

As you can see from the table, the time of the sixth event is $w_6 = 70.28$, which is
more than one hour after the start of observation. So in the simulation five events
occur in the first hour of observation. □
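The table in Example 3.6 can be reproduced with a short script. The six exponential random numbers are those quoted in the example; the variable names are illustrative only.

```python
from itertools import accumulate

mean_wait = 10   # minutes; the reciprocal of the rate of 1/10 per minute
unit_exponentials = [0.6193, 1.8350, 0.2285, 1.5106, 0.5024, 2.3326]

# Scale each unit-mean exponential number by the mean waiting time.
waiting_times = [mean_wait * e for e in unit_exponentials]

# Event times are the running totals of the waiting times.
event_times = list(accumulate(waiting_times))

for n, (e, t, w) in enumerate(zip(unit_exponentials, waiting_times, event_times), start=1):
    print(n, e, round(t, 3), round(w, 2))

events_in_first_hour = sum(1 for w in event_times if w <= 60)
print(events_in_first_hour)  # 5
```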


Question  3.4  In  an  investigation  into  computer  reliability,  one  particular  unit 
failed  on  average  every  652  seconds.  Assuming  that  the  incidence  of  failures  may 
be  adequately  modelled  by  a  Poisson  process,  use  the  fifth  row  of  the  table  of 
exponential  random  numbers  on  page  43  of  Neave  to  simulate  one  hour’s  usage 
after  switching  the  unit  on.  Give  the  times  of  failures  to  the  nearest  second.  □ 


Source:  J.  D.  Musa,  A.  Iannino, 
and  K.  Okumoto,  Software 
Reliability:  measurement, 
prediction,  application 
(McGraw-Hill,  1978). 


3.2  A  more  formal  approach  to  the  Poisson  process 

In  Subsection  3.1,  the  distribution  of  the  number  of  events  in  a  Poisson  process 
that  occur  in  an  interval  of  length  t  and  the  distribution  of  the  time  between 
successive  events  were  stated  without  proof.  In  order  to  derive  these  results,  a 
more  formal  approach  to  the  Poisson  process  is  required  than  has  been  adopted  so 
far.  In  this  subsection,  the  assumptions  of  a  Poisson  process  are  expressed 
mathematically  as  three  postulates ;  the  postulates  are  then  used  to  derive  these 
results.  Make  sure  you  work  through  this  subsection  thoroughly  and  that  you 
understand  the  ideas  that  are  introduced  here.  The  approach  used  here  is  used 
again  later  in  the  course,  most  notably  in  Units  7  and  8. 

The first two postulates of the Poisson process express mathematically the
assumptions that events occur singly and that the average rate of occurrence of
events remains constant. These postulates state the probability of one event in a
short interval and the probability of two or more events in a short interval. The
length of this short interval is denoted by $\delta t$ (read as ‘delta $t$’). The first postulate
states that the probability of one event occurring in any interval of length $\delta t$ is
approximately $\lambda\,\delta t$. The fact that this probability is the same for any interval of
length $\delta t$ implies that the rate $\lambda$ at which events occur remains constant over
time. This postulate is written more precisely as follows.

I The probability that (exactly) one event occurs in any small time interval
$[t, t + \delta t]$ is equal to $\lambda\,\delta t + o(\delta t)$.


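The notation $o(\delta t)$ (read as ‘little-oh of $\delta t$’) is used to represent any function of $\delta t$
which is of ‘smaller order’ than $\delta t$. Formally, we can write $f(\delta t) = o(\delta t)$ for any
function $f$ of $\delta t$ such that

$$\frac{f(\delta t)}{\delta t} \to 0 \quad \text{as } \delta t \to 0. \tag{3.1}$$

For example, $(\delta t)^2 = o(\delta t)$ since

$$\frac{(\delta t)^2}{\delta t} = \delta t \to 0 \quad \text{as } \delta t \to 0;$$

and $(\delta t)^3$ is $o(\delta t)$, since

$$\frac{(\delta t)^3}{\delta t} = (\delta t)^2 \to 0 \quad \text{as } \delta t \to 0.$$

Since the notation $o(\delta t)$ is used to represent any function of $\delta t$ that satisfies (3.1),
it follows that

$$\frac{o(\delta t)}{\delta t} \to 0 \quad \text{as } \delta t \to 0.$$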

The second postulate states formally the probability that two or more events
occur in a short interval of length $\delta t$. It is written as follows.

II The probability that two or more events occur in any small time interval
$[t, t + \delta t]$ is equal to $o(\delta t)$.


Essentially  this  postulate  expresses  mathematically  the  assumption  that  events 
occur  singly  in  a  Poisson  process. 


The  third  postulate  is  a  formal  statement  of  the  third  assumption  made  in 
Subsection  3.1. 

III The occurrence of events after any time $t$ is independent of the occurrence of
events before time $t$.


These  three  postulates  are  summarized  in  the  box  below. 


A  Poisson  process  is  specified  by  three  postulates. 

I The probability that (exactly) one event occurs in any small time
interval $[t, t + \delta t]$ is equal to $\lambda\,\delta t + o(\delta t)$.

II The probability that two or more events occur in any small time interval
$[t, t + \delta t]$ is equal to $o(\delta t)$.

III The occurrence of events after any time $t$ is independent of the
occurrence of events before time $t$.
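As an informal numerical check (not part of the derivation), the first two postulates can be compared with the Poisson distribution stated in Subsection 3.1. The sketch assumes that the number of events in an interval of length δt is Poisson with parameter equal to the rate multiplied by δt, and uses a hypothetical rate of 2; the function name `short_interval_probs` is illustrative only.

```python
import math

def short_interval_probs(rate, dt):
    """Return (P(exactly one event), P(two or more events)) in an interval
    of length dt, assuming the count is Poisson(rate * dt)."""
    mu = rate * dt
    p_zero = math.exp(-mu)
    p_one = mu * p_zero
    return p_one, 1.0 - p_zero - p_one

rate = 2.0  # hypothetical rate
for dt in (0.1, 0.01, 0.001):
    p_one, p_two_plus = short_interval_probs(rate, dt)
    # Postulate I: (p_one - rate*dt)/dt should tend to 0 as dt shrinks.
    # Postulate II: p_two_plus/dt should tend to 0 as dt shrinks.
    print(dt, (p_one - rate * dt) / dt, p_two_plus / dt)
```

Both printed ratios shrink roughly in proportion to the interval length, which is the behaviour required of a term of smaller order than δt.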


The most obvious sequence of random variables associated with a Poisson process
is $\{X(t);\ t \ge 0\}$, where $X(t)$ denotes the number of events that have occurred by
time $t$. It is assumed here that the process starts at time zero, so $X(0) = 0$. The
distribution of $X(t)$ is given by $X(t) \sim \text{Poisson}(\lambda t)$, which may be proved
formally as follows.

Suppose we start observing a Poisson process at time 0 and continue for a fixed
time $t$. Then the number of events that occur in the interval $[0, t]$ is a random
variable $X(t)$; we shall denote its probability function by $p_x(t)$, so

$$p_x(t) = P(X(t) = x).$$

We  shall  begin  by  finding  p0(t)  =  P(X(t)  =  0),  which  is  the  probability  that  no 
events  occur  in  [0,  t].  We  consider  the  interval  [0 ,t]  and  a  further  short  interval 
[t,t  +  St],  and  we  then  derive  the  probability  that  no  events  have  occurred  at  the 


This  is  not  the  standard  notation 
for  a  probability  function;  we  adopt 
it  here  to  stress  that  the  probability 
is  a  function  of  t  as  well  as  of  x. 


32 


end  of  this  second  interval:  p0(t  +  St).  Using  Postulate  III,  the  occurrence  of 
events  after  any  time  t  is  independent  of  what  happened  before  time  £,  so 

Po{t  +  St)  =  P(no  events  by  time  t  +  5t) 

—  P( no  events  in  [0,  t])  x  P(no  events  in  [£,  t  +  &]). 

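The first of these probabilities is $p_0(t)$. The second probability is equal to

$$\begin{aligned}
1 - P(\text{one event in } [t, t + \delta t]) - P(\text{two or more events in } [t, t + \delta t])
&= 1 - (\lambda\,\delta t + o(\delta t)) - o(\delta t) \\
&= 1 - \lambda\,\delta t + o(\delta t).
\end{aligned}$$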

You can see here one of the advantages of the notation $o(\delta t)$: it allows us to
simplify expressions. Here we have been able to replace $-o(\delta t) - o(\delta t)$ by $o(\delta t)$.
(Adding or subtracting any finite number of functions of smaller order than $\delta t$
always gives a function of smaller order than $\delta t$.) So we have

$$p_0(t + \delta t) = p_0(t) \times (1 - \lambda\,\delta t + o(\delta t)),$$

and this can be written

$$\frac{p_0(t + \delta t) - p_0(t)}{\delta t} = -\lambda p_0(t) + \frac{o(\delta t)}{\delta t}.$$

(Since $p_0(t) \le 1$, $o(\delta t)\,p_0(t) = o(\delta t)$.)

Now we let $\delta t$ tend to zero; the left-hand side tends by definition to the derivative
of $p_0(t)$ with respect to $t$ and, on the right-hand side, $o(\delta t)/\delta t$ tends to zero by
definition. So this leads to the differential equation

$$\frac{dp_0(t)}{dt} = -\lambda p_0(t). \tag{3.2}$$

Equation (3.2) can be integrated using separation of variables (this method is
described on page 6 of the Handbook):

$$\int \frac{dp_0(t)}{p_0(t)} = \int -\lambda\,dt.$$

Performing both integrations gives

$$\log p_0(t) = -\lambda t + c.$$


Now,  when  t  =  0  we  are  just  starting  to  observe  the  process,  so  no  events  have 
occurred  and 


$$p_0(0) = P(\text{no events in } [0, 0]) = 1.$$

Putting $t = 0$ gives

$$\log 1 = 0 + c,$$

and hence $c = 0$. So the solution is

$$\log p_0(t) = -\lambda t,$$

or

$$p_0(t) = e^{-\lambda t}. \tag{3.3}$$

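The differential equation satisfied by $p_x(t)$ ($x = 1, 2, \ldots$) is derived in similar
fashion:

$$\begin{aligned}
p_x(t + \delta t) &= P(x \text{ events have occurred by time } t + \delta t) \\
&= P\big([x \text{ events in } [0, t] \text{ and no events in } [t, t + \delta t]] \\
&\qquad \cup\ [(x - 1) \text{ events in } [0, t] \text{ and one event in } [t, t + \delta t]] \\
&\qquad \cup\ [\text{fewer than } (x - 1) \text{ events in } [0, t] \text{ and more than one event in } [t, t + \delta t]]\big).
\end{aligned}$$

This is the union of three mutually exclusive events, so we can add the separate
probabilities:

$$\begin{aligned}
p_x(t + \delta t) &= P([x \text{ events in } [0, t]] \cap [\text{no events in } [t, t + \delta t]]) \\
&\quad + P([(x - 1) \text{ events in } [0, t]] \cap [\text{one event in } [t, t + \delta t]]) \\
&\quad + P([\text{fewer than } (x - 1) \text{ events in } [0, t]] \cap [\text{more than one event in } [t, t + \delta t]]).
\end{aligned}$$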


This gives

$$\begin{aligned}
p_x(t + \delta t) &= p_x(t) \times (1 - \lambda\,\delta t + o(\delta t)) + p_{x-1}(t) \times (\lambda\,\delta t + o(\delta t)) + o(\delta t) \\
&= p_x(t) + \lambda (p_{x-1}(t) - p_x(t))\,\delta t + o(\delta t).
\end{aligned}$$

(Again, we use the brevity of the interval $[t, t + \delta t]$ to attach probabilities to
the three possibilities that nothing happens, one thing happens, or more than one
thing happens.)

Rearranging as before, we get

$$\frac{p_x(t + \delta t) - p_x(t)}{\delta t} = \lambda (p_{x-1}(t) - p_x(t)) + \frac{o(\delta t)}{\delta t}.$$

Letting $\delta t \to 0$ leads to the differential equations

$$\frac{dp_x(t)}{dt} = \lambda (p_{x-1}(t) - p_x(t)), \quad x = 1, 2, \ldots. \tag{3.4}$$

So we have now derived the set of differential equations satisfied by the
probability functions $p_x(t)$. As you see, each equation contains both $p_x(t)$ and
$p_{x-1}(t)$. So we solve them for successively increasing $x$, starting with $x = 1$:

$$\frac{dp_1(t)}{dt} = \lambda p_0(t) - \lambda p_1(t)$$

or, substituting for $p_0(t)$ using (3.3),

$$\frac{dp_1(t)}{dt} + \lambda p_1(t) = \lambda e^{-\lambda t}.$$

This differential equation can be solved using the integrating factor method
(described on page 6 of the Handbook). In this case the integrating factor is
$e^{\lambda t}$, so we multiply both sides by $e^{\lambda t}$:

$$e^{\lambda t} \frac{dp_1(t)}{dt} + \lambda e^{\lambda t} p_1(t) = \lambda.$$

The left-hand side is the derivative of the product $e^{\lambda t} \times p_1(t)$. So we can rewrite
the differential equation as

$$\frac{d}{dt}\left(e^{\lambda t} p_1(t)\right) = \lambda$$

and integrating we obtain

$$e^{\lambda t} p_1(t) = \lambda t + c.$$

When $t = 0$, no events have occurred, so $p_1(0) = P(X(0) = 1) = 0$ and hence
$c = 0$. So

$$p_1(t) = \lambda t e^{-\lambda t}.$$

The differential equation (3.4) with $x = 2$ can now be solved. The technique is
exactly the same as with $x = 1$, but this time you need to use the value of $p_1(t)$.

Question 3.5  Solve the differential equation (3.4) to derive an expression for
$p_2(t)$.  □

We could continue in this way to find $p_3(t)$, then $p_4(t)$, and so on; but if you look
at the results we have so far, you may recognize them as probabilities from a
familiar distribution. Remember from Unit 1 that the probability function of the
Poisson distribution with parameter $\mu$ is

$$p_X(x) = P(X = x) = \begin{cases} \dfrac{e^{-\mu} \mu^x}{x!}, & x = 0, 1, 2, \ldots, \\ 0, & \text{otherwise.} \end{cases}$$

Putting $\mu = \lambda t$ you can see that $p_0(t)$, $p_1(t)$ and $p_2(t)$ are just the first three
probabilities in a Poisson distribution with parameter $\lambda t$. So we shall conjecture
that $X(t)$ has a Poisson distribution with parameter $\lambda t$. This can be either proved
by induction or verified by substitution in the general differential equation. We
use the second of these methods.


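If

$$p_x(t) = \frac{e^{-\lambda t} (\lambda t)^x}{x!},$$

then

$$\begin{aligned}
\frac{d}{dt} p_x(t) &= \frac{-\lambda e^{-\lambda t} (\lambda t)^x}{x!} + \frac{e^{-\lambda t} \lambda^x x t^{x-1}}{x!} \\
&= -\lambda p_x(t) + \frac{\lambda e^{-\lambda t} (\lambda t)^{x-1}}{(x-1)!} \\
&= -\lambda p_x(t) + \lambda p_{x-1}(t),
\end{aligned}$$

which is the differential equation (3.4).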

So the Poisson distribution is the solution to the set of differential equations (3.4)
with initial values $p_0(0) = 1$, $p_x(0) = 0$ for $x = 1, 2, \ldots$.


The random variable $X(t)$ representing the number of events occurring by
time $t$ in a Poisson process has a Poisson distribution, with parameter $\lambda t$.
That is, $X(t) \sim \text{Poisson}(\lambda t)$ and

$$p_x(t) = P(X(t) = x) = \frac{e^{-\lambda t} (\lambda t)^x}{x!}, \quad x = 0, 1, \ldots.$$


Because $\lambda$, the rate of occurrence of events, is constant, this result holds for any
time interval of length $t$; it does not matter when we start observing the process.
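As a numerical cross-check on the derivation (not part of the original argument), the following sketch integrates the differential equations (3.2) and (3.4) by Euler's method, starting from $p_0(0) = 1$ and $p_x(0) = 0$, and compares the result with the Poisson probabilities. The values $\lambda = 2$ and $t = 1.5$, the step count and the function names are illustrative assumptions.

```python
import math


def poisson_pmf(x, mu):
    """P(X = x) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**x / math.factorial(x)


def solve_equations(lam, t_end, x_max=10, steps=20000):
    """Integrate dp_0/dt = -lam*p_0 and dp_x/dt = lam*(p_{x-1} - p_x)
    by Euler's method, from p_0(0) = 1 and p_x(0) = 0 for x >= 1."""
    dt = t_end / steps
    p = [1.0] + [0.0] * x_max
    for _ in range(steps):
        new_p = [p[0] - lam * p[0] * dt]
        for x in range(1, x_max + 1):
            new_p.append(p[x] + lam * (p[x - 1] - p[x]) * dt)
        p = new_p
    return p


lam, t = 2.0, 1.5  # illustrative values, not taken from the text
numerical = solve_equations(lam, t)
for x in range(4):
    # Each row: x, numerical p_x(t), Poisson(lam * t) probability
    print(x, round(numerical[x], 4), round(poisson_pmf(x, lam * t), 4))
```

The two columns should agree to within the discretization error of Euler's method, which shrinks as `steps` increases.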

You have seen that the other quantity that is often of interest in a Poisson
process, and indeed in many random processes, is the time between successive
events. You have also seen that the time from the start of the process until the
first event is denoted $T_1$ and the time between the $(k - 1)$th and the $k$th events is
denoted $T_k$. Each $T_k$ is referred to as an inter-event time. Again this is
analogous to $\{T_k\}$ for the Bernoulli process, though in that case $T_k$ is a discrete
variate; here it is a continuous one.

The distribution of $T_1$ can be derived using the distribution of $X(t)$. Since $T_1$
exceeds $t$ if and only if no events occur in $[0, t]$, we have

$$\begin{aligned}
P(T_1 > t) &= P(\text{no events occur in } [0, t]) \\
&= P(X(t) = 0) \\
&= e^{-\lambda t}, \quad \text{since } X(t) \sim \text{Poisson}(\lambda t).
\end{aligned}$$

Hence, if $T_1$ has c.d.f. $F(t)$, then

$$F(t) = P(T_1 \le t) = 1 - P(T_1 > t),$$

so the c.d.f. of $T_1$ is

$$F(t) = 1 - e^{-\lambda t}, \quad t \ge 0,$$

and the p.d.f. of $T_1$ is

$$f(t) = \lambda e^{-\lambda t}, \quad t \ge 0.$$

Hence $T_1$ has the exponential distribution with parameter $\lambda$: $T_1 \sim M(\lambda)$.

Because  of  the  memoryless  property  of  the  exponential  distribution,  observation 
can  start  at  any  stage  of  the  process:  the  time  at  which  the  previous  event 
occurred  is  irrelevant.  Whether  it  occurred  immediately  before  observation 
started  or  a  long  time  before,  the  time  until  the  next  event  always  has  an 
exponential distribution. Also, the distribution of each inter-event time $T_k$ for
$k = 1, 2, \ldots$ is exponential with parameter $\lambda$.




The time $T_1$ from the start of observation in a Poisson process to the first
event has an exponential distribution with parameter $\lambda$.

For $k = 2, 3, \ldots$, the time $T_k$ between the $(k - 1)$th event and the $k$th event
has an exponential distribution with parameter $\lambda$.

That is,

$$T_k \sim M(\lambda), \quad k = 1, 2, \ldots.$$


This result provides the simplest method of simulating observations from a
Poisson process. Independent observations from $M(\lambda)$ can be used to simulate the
inter-event times.
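The simulation method just described can be sketched in a few lines. This is a minimal illustration, assuming a rate $\lambda = 2$, an observation period of length 5 and a seeded generator, none of which come from the text; `simulate_poisson_process` is a hypothetical helper name. It sums independent $M(\lambda)$ inter-event times to obtain event times, then checks that the count in $(0, t]$ has mean and variance close to $\lambda t$, as a Poisson($\lambda t$) variable should.

```python
import random


def simulate_poisson_process(lam, t_end, rng):
    """Return the event times in (0, t_end] of a Poisson process with rate lam,
    built by accumulating independent exponential inter-event times."""
    times = []
    t = rng.expovariate(lam)  # T_1 ~ M(lam)
    while t <= t_end:
        times.append(t)
        t += rng.expovariate(lam)  # next inter-event time T_k ~ M(lam)
    return times


rng = random.Random(343)  # seeded so that the run is reproducible
lam, t_end = 2.0, 5.0     # illustrative values
counts = [len(simulate_poisson_process(lam, t_end, rng)) for _ in range(20000)]

mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / (len(counts) - 1)
print("sample mean:", round(mean_count, 2), " sample variance:", round(var_count, 2))
```

Both sample statistics should be close to $\lambda t = 10$.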

Many  of  the  processes  to  be  studied  in  this  course  are  extensions  of  the  Poisson 
process.  We  consider  models  for  events  occurring  in  time,  by  changing  the 
postulates  in  various  ways.  The  Poisson  process  provides  the  basis  for  many  of  the 
models  discussed  in  Units  7-11. 


3.3  The  multivariate  Poisson  process 

A  multivariate  Poisson  process  is  a  Poisson  process  in  which  each  of  the  sequence 
of  events  may  be  just  one  of  several  different  types  of  event.  For  instance,  in 
traffic  research,  an  event  may  correspond  to  the  passage  of  a  vehicle  past  a  point 
of  observation  by  the  side  of  a  road.  The  vehicles  may  themselves  be  classified  as 
private  cars,  lorries,  buses,  motorcycles,  and  so  on.  So  there  are  several  types  of 
event,  and  each  event  in  the  process  is  of  just  one  type. 

In general, the multivariate Poisson process may be described as follows. Suppose
that events occur as a Poisson process in time at rate $\lambda$, and that each event is
one of $k$ types. Let the probability that an event is of type $i$ be $p_i$, where

$$\sum_{i=1}^{k} p_i = 1,$$

and suppose that occurrences of different types are independent of each other.
Then the process is a multivariate Poisson process.

It is not difficult to show that events of type $i$ occur as a Poisson process in time
at rate $\lambda p_i$. Consider the occurrence of events of type $i$ in the small interval
$[t, t + \delta t]$:

$$\begin{aligned}
P(\text{one event of type } i \text{ occurs in } [t, t + \delta t])
&= P(\text{one event occurs in } [t, t + \delta t] \text{ and it is of type } i) \\
&\quad + P(\text{more than one event in } [t, t + \delta t] \text{ and one of them is of type } i) \\
&= P(\text{one event in } [t, t + \delta t]) \times P(\text{an event is of type } i) + o(\delta t) \\
&= (\lambda\,\delta t + o(\delta t)) \times p_i + o(\delta t) \\
&= \lambda p_i\,\delta t + o(\delta t).
\end{aligned}$$

So

$$P(\text{one event of type } i \text{ occurs in } [t, t + \delta t]) = \lambda p_i\,\delta t + o(\delta t).$$

This is Postulate I for a Poisson process with rate $\lambda p_i$. The other two postulates
are also satisfied by the process of events of type $i$. So events of type $i$ occur as in
a Poisson process with rate $\lambda p_i$, as stated.

It follows that the multivariate Poisson process is built up of $k$ independent
Poisson processes with rates $\lambda p_1, \ldots, \lambda p_k$. (This result was used in
Example 3.6.)

A typical realization of a multivariate Poisson process is shown in Figure 3.9.


Figure  3.9  Events  in  a  multivariate  Poisson  process 
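A realization of this kind can be generated by simulating the overall process and labelling each event with a type. The sketch below assumes an overall rate $\lambda = 10$ and type probabilities 0.5, 0.3 and 0.2; these values, the seed and the name `simulate_multivariate` are illustrative and do not come from the text. It then estimates the rate of each type, which should be close to $\lambda p_i$.

```python
import random


def simulate_multivariate(lam, probs, t_end, rng):
    """Simulate a Poisson process of rate lam on (0, t_end] and give each event
    a type 1..k, chosen independently with probabilities probs.
    Returns a list of (time, type) pairs in time order."""
    types = list(range(1, len(probs) + 1))
    events = []
    t = rng.expovariate(lam)
    while t <= t_end:
        kind = rng.choices(types, weights=probs)[0]
        events.append((t, kind))
        t += rng.expovariate(lam)
    return events


rng = random.Random(7)       # seeded for reproducibility
lam = 10.0                   # hypothetical overall rate
probs = [0.5, 0.3, 0.2]      # hypothetical type probabilities p_1, p_2, p_3
t_end = 2000.0
events = simulate_multivariate(lam, probs, t_end, rng)

# Estimated rate of each type: number of events of that type per unit time.
estimated_rates = [
    sum(1 for _, kind in events if kind == i) / t_end
    for i in range(1, len(probs) + 1)
]
print([round(r, 2) for r in estimated_rates])
```

The estimates should be near $\lambda p_1 = 5$, $\lambda p_2 = 3$ and $\lambda p_3 = 2$.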

Example  3.7 

Suppose  that  vehicles  pass  an  observer  standing  at  the  edge  of  a  main  road 
according  to  a  Poisson  process  at  an  average  rate  of  100  vehicles  per  hour.  Of 
these  vehicles  a  proportion  of  0.6  are  cars,  0.3  are  lorries,  0.08  are  coaches  and 
0.02  are  motorcycles. 

(i)  At  what  rate  do  coaches  pass  the  observer? 

(ii)  What  is  the  probability  that  in  a  ten-minute  interval  more  than  four  lorries 
pass  the  observer? 

(iii)  What  is  the  average  waiting  time  between  successive  motorcycles? 

Solution 

Writing the proportions of the four types of vehicle respectively as $p_1 = 0.6$,
$p_2 = 0.3$, $p_3 = 0.08$, $p_4 = 0.02$, and given $\lambda = 100$ per hour, we have

$\lambda_1 = \lambda p_1 = 60$ per hour, $\lambda_2 = \lambda p_2 = 30$ per hour,

$\lambda_3 = \lambda p_3 = 8$ per hour, $\lambda_4 = \lambda p_4 = 2$ per hour.

(i)  Passenger coaches pass the observer at rate $\lambda_3 = 8$ per hour.

(ii)  The expected number of lorries in 10 minutes is $\lambda_2/6 = 30/6 = 5$. The
probability that more than four lorries pass is $P(L > 4)$, where
$L \sim \text{Poisson}(5)$; that is,

$$1 - P(L \le 4) = 1 - e^{-5}\left(1 + 5 + \frac{5^2}{2!} + \frac{5^3}{3!} + \frac{5^4}{4!}\right) \approx 0.5595.$$

(iii)  The waiting time between successive motorcycles has an exponential
distribution with parameter $\lambda_4 = 2$. So the average waiting time between
successive motorcycles is

$$\frac{1}{\lambda_4} = \frac{1}{2} \text{ hour} = 30 \text{ minutes.}  \quad □$$
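The arithmetic in this solution can be reproduced with a short script. The sketch below uses only the figures given in Example 3.7; the helper `poisson_cdf` and the variable names are illustrative choices.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    return sum(math.exp(-mu) * mu**x / math.factorial(x) for x in range(k + 1))


lam = 100.0  # vehicles per hour
proportions = {"cars": 0.6, "lorries": 0.3, "coaches": 0.08, "motorcycles": 0.02}

# Each type of vehicle passes as a Poisson process with rate lam * p_i.
rates = {kind: lam * p for kind, p in proportions.items()}

# (i) rate at which coaches pass, per hour
coach_rate = rates["coaches"]

# (ii) lorries in ten minutes: L ~ Poisson(lambda_2 / 6); we need P(L > 4)
mean_lorries = rates["lorries"] * (10 / 60)
prob_more_than_four = 1 - poisson_cdf(4, mean_lorries)

# (iii) mean waiting time between motorcycles is 1 / lambda_4 hours
mean_wait_minutes = 60 / rates["motorcycles"]

print(coach_rate, round(prob_more_than_four, 4), mean_wait_minutes)
```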

Question  3.6  Customers  arrive  at  a  bank  according  to  a  Poisson  process  at  an 
average  rate  of  ten  per  minute.  A  proportion  of  0.6  of  customers  simply  wish  to 
draw  out  money  (type  A),  a  proportion  of  0.3  wish  to  pay  in  money  (type  B)  and 
a proportion of 0.1 wish to carry out a more complicated transaction (type C).

(i)  What  is  the  probability  that  more  than  five  customers  arrive  in  a  time 
interval  of  duration  30  seconds? 

(ii)  What  is  the  probability  that,  in  one  minute,  six  customers  of  type  A  arrive? 

(iii)  What  is  the  probability  that,  in  one  minute,  six  customers  of  type  A,  three 
of  type  B  and  at  least  one  of  type  C  arrive?  □ 

An  alternative  description  of  a  multivariate  Poisson  process  is  the  following. 

Events of type 1 occur as a Poisson process in time at rate $\lambda_1$; events of type 2
occur as a Poisson process in time at rate $\lambda_2$; ...; events of type $n$ occur as a
Poisson process in time at rate $\lambda_n$. If it may be assumed that the processes are
occurring independently of one another, and if the events are superimposed on the
same time axis, then the overall sequence of events occurs as a Poisson process at
rate

$$\lambda = \lambda_1 + \lambda_2 + \cdots + \lambda_n = \sum_{i=1}^{n} \lambda_i.$$


In any such superimposed realization, the probability that an event is of type $i$ is
given by

$$p_i = \frac{\lambda_i}{\lambda} = \frac{\lambda_i}{\lambda_1 + \lambda_2 + \cdots + \lambda_n}.$$
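This alternative description can be explored by simulation. The sketch below superimposes three independent Poisson processes with hypothetical rates 3, 1.5 and 0.5 (chosen for illustration only, as are the seed and the helper name `poisson_times`), then estimates the overall rate and the proportion of events of each type. These should be close to $\lambda = 5$ and to $\lambda_i/\lambda$ respectively.

```python
import random


def poisson_times(rate, t_end, rng):
    """Event times in (0, t_end] of a Poisson process with the given rate."""
    times = []
    t = rng.expovariate(rate)
    while t <= t_end:
        times.append(t)
        t += rng.expovariate(rate)
    return times


rng = random.Random(2)     # seeded for reproducibility
rates = [3.0, 1.5, 0.5]    # hypothetical rates lambda_1, lambda_2, lambda_3
t_end = 2000.0

# Superimpose the independent processes on one time axis, keeping the type label.
merged = sorted(
    (t, i)
    for i, rate in enumerate(rates, start=1)
    for t in poisson_times(rate, t_end, rng)
)

overall_rate = len(merged) / t_end
shares = [
    sum(1 for _, i in merged if i == k) / len(merged)
    for k in range(1, len(rates) + 1)
]
print(round(overall_rate, 2), [round(s, 3) for s in shares])
```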

Question  3.7  Over  the  course  of  an  evening,  a  university  tutor  has  noticed  that 
telephone  calls  arrive:  from  students  according  to  a  Poisson  process  at  an  average 
rate  of  one  every  90  minutes;  independently  from  members  of  her  family  according 
to  a  Poisson  process  at  an  average  rate  of  one  every  three  hours;  and 
independently  from  friends  according  to  a  Poisson  process  at  an  average  rate  of 
one  call  per  hour.  She  receives  no  other  calls. 

(i)  What is the probability that, between 9 pm and 11 pm tomorrow evening,
her  telephone  will  not  ring? 

(ii)  What  is  the  probability  that  the  first  call  thereafter  is  from  a  student?  □ 


4  Extensions  of  the  Poisson  process 

In  this  section  two  extensions  of  the  Poisson  process  are  described  and  their 
properties  summarized.  The  first  band  of  the  video-cassette  is  associated  with 
Subsection  4.1,  so  you  may  wish  to  watch  it  during  your  study  of  this  subsection. 


4.1  The  non-homogeneous  Poisson  process 

You  have  seen  several  examples  of  events  occurring  at  random  in  continuous  time 
where  a  Poisson  process  provides  an  adequate  model.  In  other  cases  the  model 
may fail, and one situation where this happens is when the average rate $\lambda$ at
which  events  occur  may  not  reasonably  be  assumed  to  be  constant  with  passing 
time.  Two  examples  where  this  is  the  case  were  described  at  the  beginning  of 
Section  3:  the  rate  of  occurrence  of  serious  coal-mining  accidents  in  Great  Britain 
decreased  after  about  1890  (Example  3.2);  and  the  rate  of  occurrence  of  fatalities 
in  road  accidents  appears  to  have  changed  between  1965  and  1985  (Example  3.3). 
Several  more  examples  are  described  below. 

Example  4.1  Learning  to  ride 

A young girl learning to ride a bicycle has accidents at random. However, as she
improves, her rate of accidents decreases and so the Poisson process does not
provide an adequate model. We need a model in which the rate decreases with
time. One possibility is to modify the Poisson process so as to keep the second
and third postulates but to change the first. (The postulates are given in
Subsection 3.2.) Suppose that the probability that an accident occurs in the
interval $[t, t + \delta t]$ is $[24/(2 + t)]\,\delta t + o(\delta t)$, where $t$ is
measured in days. The rate $24/(2 + t)$ now decreases with time so that, when the
girl starts to learn ($t = 0$), she will have an accident once every two hours on the
average, but after 10 days ($t = 10$) the rate has dropped to two per day and after
a year it is only about one per fortnight. Accidents still occur independently of
each other, and the rate at time $t$ is independent of how many accidents the girl
has had before time $t$.

We might be interested in the number of accidents over some specified time
interval, or the distribution of the time until the next accident after some given
time. Two random processes associated with this model are $\{X(t); t \ge 0\}$, where
$X(t)$ is the number of accidents by time $t$, and $\{T_k; k = 1, 2, \ldots\}$, where each $T_k$ is
an inter-event time, defined exactly as for the Poisson process.  □


Example  4.2  Accident  and  emergency  unit 

Suppose  that  it  is  required  to  model  arrivals  at  an  accident  and  emergency  unit  in 
a  hospital.  Experience  from  monitoring  actual  arrivals  over  a  long  period  suggests 
that  in  fact  there  is  a  daily  cycle,  and  that  the  unit  is  more  busy  at  some  times 
than  at  others.  These  differences  are  quite  perceptible,  and  a  Poisson  process  for 
modelling  arrivals  is  deemed  to  be  insufficiently  accurate.  In  fact,  arrivals  may  be 
regarded  as  occurring  at  a  more  or  less  constant  rate  from  (say)  6  am  to  6  pm,  but 
start  to  rise  thereafter  until  2  am,  after  which  they  are  rather  sparse  until  6  am.  If 
we denote the arrival rate at time $t$ by $\lambda(t)$, then a graph of $\lambda(t)$ against $t$ (the
average  arrival  rate  against  passing  time)  may  look  something  like  that  in 
Figure  4.1. 


Figure  4.1  Arrival  rate  at  an  accident  and  emergency 
unit  over  24  hours  □ 

A  large  number  of  subtle  and  important  aspects  of  modelling  are  raised  by  the 
preceding  paragraphs.  For  instance,  when  and  why  might  a  simple  model  be 
deemed ‘insufficiently accurate’? What estimation techniques might be used to
formulate  an  alternative  model?  What  considerations  arise  when  attempting  to 
choose  between  several  different  models,  and  what  sort  of  formal  tests  might  be 
adopted  to  assist  in  that  choice?  Only  in  the  next  unit  does  this  course  deal  with 
this  sort  of  question,  but  you  ought  always  to  remember  that  rarely  is  a  model 
‘right’ or, for that matter, ‘wrong’: models are simply more or less ‘adequate’,
and  that  adjective  raises  many  qualitative  judgements  which  may  be  made  only 
when  it  is  known  to  what  use  the  model  is  to  be  put. 

Example  4.3  Mistakes  during  learning 

In  any  learning  situation  (such  as  a  child  learning  to  ride  a  bicycle,  or  an  adult 
learning  to  perform  some  complex  task,  such  as  text-processing  or  bricklaying) 
there  will  be  an  initial  period  during  which  many  accidents  will  happen  and 
mistakes  will  be  made;  then,  as  the  learner  becomes  more  proficient,  mistakes  still 
occur  (haphazardly — that  is,  at  random)  but  at  a  rate  that  is  lower  than  in  the 
initial  stages.  Figure  4.2  shows  a  possible  realization  of  events,  their  incidence 
becoming  sparser  with  passing  time  (but  notice  that  there  is  no  regular  pattern  to 
the  events:  it  is  in  the  nature  of  accidents  and  mistakes  that  they  are  unexpected 
and unforecastable).


Figure  4.2  Realization  of  a  random  process:  mistakes  during  learning 


The next figure shows a hypothetical general form for the accident rate $\lambda(t)$ in a
learning process, say, riding a bicycle. Initially, accidents (falling off, hitting
obstacles)  may  occur  as  frequently  as  ten  or  twelve  times  a  day  (on  average), 
reducing  after  a  week  to  once  or  twice  a  day,  reducing  eventually  to  (say)  once  or 
twice  a  year  for  an  experienced  cyclist. 


Figure  4.3  Accident  rate  during  learning 

In  some  cases  it  may  be  appropriate  to  reduce  the  rate  of  events  to  zero,  with 
passing  time.  □ 

It  may  help  you  to  see  what  is  going  on  to  contrast  these  ideas  with  the  simple 
Poisson  process  of  Subsections  3.1  and  3.2.  In  Figure  4.4  a  typical  realization  of 
events  is  depicted.  Of  course,  by  the  very  nature  of  random  events,  the  incidence 
of  events  at  some  times  is  busier  than  at  other  times,  but  a  consideration  of  the 
underlying  process  does  not  provide  any  reason  for  proposing  a  real  variation  in 
the  underlying  rate,  which  is  shown  unvarying  in  Figure  4.5. 


Figure 4.4  A typical realization

Figure 4.5  The underlying rate $\lambda(t) = \lambda$

The  non-homogeneous  Poisson  process,  in  which  the  rate  changes  with  time, 
is  defined  by  three  postulates.  As  has  already  been  suggested,  the  second  and 
third  postulates  are  the  same  as  for  a  simple  Poisson  process.  The  first  postulate 
is  a  modification  of  the  first  postulate  of  the  simple  Poisson  process  to  allow  the 
rate  of  occurrence  of  events  to  vary  with  time. 

I  The probability that (exactly) one event occurs in the small time interval
$[t, t + \delta t]$ is equal to $\lambda(t)\,\delta t + o(\delta t)$, for each $t$.

The function $\lambda(t)$ can take any form, for example linear, quadratic or
trigonometrical, provided $\lambda(t) \ge 0$. The behaviour of a non-homogeneous Poisson
process is illustrated on the video-cassette for several different functions $\lambda(t)$.


This  is  a  suitable  point  at  which  to  watch  Band  A  of  the  video-cassette. 

The  distribution  of  X(t) 

It is relatively straightforward to derive the probability distribution of the random
variable $X(0, t) = X(t)$, denoting the number of events to occur in a
non-homogeneous Poisson process over the time interval $(0, t]$; all that is required
is an extension to the argument of Subsection 3.2. But it is the result that is
important, not its derivation, so we shall be content here merely to state it.


If $X(t)$ is the number of events occurring in a non-homogeneous Poisson
process with rate $\lambda(t)$ during the interval $(0, t]$, then $X(t)$ has a Poisson
distribution with parameter $\mu(t)$,

$$X(t) \sim \text{Poisson}(\mu(t)),$$

where

$$\mu(t) = \int_0^t \lambda(u)\,du.$$


Since  the  mean  of  a  Poisson  distribution  is  equal  to  its  parameter,  we  have  the 
following  result. 


The expected number of events in the interval $(0, t]$ is

$$E[X(t)] = \mu(t) = \int_0^t \lambda(u)\,du.$$

Since $\mu(t)$ is found by integrating the rate $\lambda(t)$, it can be represented by an area
under the graph of $\lambda(t)$. This is illustrated in Figure 4.6.

Figure 4.6  The expected number of events in the time interval $(0, t]$

The number of events in the time interval $(t_1, t_2]$ (where $t_2 > t_1$) is

$$X(t_1, t_2) = X(t_2) - X(t_1),$$

and so

$$E[X(t_1, t_2)] = E[X(t_2)] - E[X(t_1)] = \mu(t_2) - \mu(t_1).$$

The mean of $X(t_1, t_2)$ is denoted $\mu(t_1, t_2)$, so we can write

$$\mu(t_1, t_2) = \mu(t_2) - \mu(t_1) = \int_{t_1}^{t_2} \lambda(u)\,du.$$

This result is illustrated in Figure 4.7.

Figure 4.7  The expected number of events in $(t_1, t_2]$

The number of events occurring in $(t_1, t_2]$ has a Poisson distribution with mean
$\mu(t_1, t_2)$:

$$X(t_1, t_2) \sim \text{Poisson}(\mu(t_1, t_2)).$$

This  result  is  summarized  in  the  box  below. 


The number of events $X(t_1, t_2)$ occurring in the time interval $(t_1, t_2]$ (where
$t_2 > t_1$) in a non-homogeneous Poisson process with rate $\lambda(t)$, $t \ge 0$, has a
Poisson distribution with mean

$$\mu(t_1, t_2) = \mu(t_2) - \mu(t_1).$$
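When $\lambda(t)$ has no convenient antiderivative, $\mu(t_1, t_2)$ can be approximated numerically. The sketch below is one way to do this, using the composite Simpson's rule; the rate $\lambda(t) = 2 + \sin t$, the interval $(0, \pi]$ and the function names are hypothetical choices made for illustration. For this rate the exact value is $2\pi + 2$, so the approximation can be checked.

```python
import math


def expected_count(rate, t1, t2, n=1000):
    """Approximate mu(t1, t2), the integral of rate(u) from t1 to t2,
    by the composite Simpson's rule with n (even) subintervals."""
    h = (t2 - t1) / n
    total = rate(t1) + rate(t2)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * rate(t1 + k * h)
    return total * h / 3


def poisson_pmf(x, mean):
    """P(X = x) for X ~ Poisson(mean)."""
    return math.exp(-mean) * mean**x / math.factorial(x)


def rate(t):
    """Hypothetical non-negative rate function lambda(t) = 2 + sin(t)."""
    return 2.0 + math.sin(t)


mean = expected_count(rate, 0.0, math.pi)  # exact value is 2*pi + 2
prob_no_events = poisson_pmf(0, mean)      # P(X(0, pi) = 0) = exp(-mean)
print(round(mean, 6), prob_no_events)
```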




Example  4.4 


Suppose  that  the  hourly  error  rate  for  a  laboratory  rat  learning  its  way  round  a 
maze  is  given  by 


$$\lambda(t) = 8e^{-t}, \quad t \ge 0.$$

This  is  illustrated  in  Figure  4.8. 

(i)  Find  the  expected  number  of  errors  the  rat  makes  during  its  first  t  hours  in 
the  maze. 

(ii)  Find  the  expected  number  of  errors  during  the  first  hour  in  the  maze. 

(iii)  Find  the  probability  that  in  its  second  hour  in  the  maze  the  rat  makes  more 
than  two  mistakes. 

(iv)  Find  the  probability  that  after  three  hours  in  the  maze  the  rat  never  makes 
another  mistake. 


Solution 


(i)  A good approach in any problem involving a non-homogeneous Poisson
process is to begin by finding an expression for $\mu(t)$, whether or not this is
asked for explicitly. This can then be used to calculate values of $\mu(t_1)$,
$\mu(t_1, t_2)$ for specific values of $t_1$, $t_2$. In this case $\mu(t)$ is required to answer
the first part of the question.

The expected number of errors during the first $t$ hours, that is during the
time interval $(0, t]$, is given by

$$\mu(t) = \int_0^t \lambda(u)\,du = \int_0^t 8e^{-u}\,du = \left[-8e^{-u}\right]_0^t = 8(1 - e^{-t}).$$

Figure 4.8  The function $\lambda(t) = 8e^{-t}$, $t \ge 0$

(ii)  The expected number of errors in the first hour is

$$\mu(1) = 8(1 - e^{-1}) \approx 5.057.$$

This is illustrated in Figure 4.9.

Figure 4.9  The expected number of errors in the first hour, $\mu(1)$

(iii)  The expected number of errors the rat makes during its second hour in the
maze is given by

$$\mu(1, 2) = \mu(2) - \mu(1) = 8(1 - e^{-2}) - 8(1 - e^{-1}) = 8(e^{-1} - e^{-2}) \approx 1.860.$$

This is illustrated in Figure 4.10.

Figure 4.10  The expected number of errors in the second hour, $\mu(1, 2)$

So the number of errors the rat makes in the second hour has a Poisson
distribution with mean 1.86. The probability of more than two mistakes is

$$\begin{aligned}
P(X(1, 2) > 2) &= 1 - P(X(1, 2) \le 2) \\
&= 1 - e^{-1.86}\left(1 + 1.86 + \frac{1.86^2}{2!}\right) \\
&\approx 0.285.
\end{aligned}$$

(iv)  The expected number of mistakes made by the rat in the period after having
spent three hours in the maze is given by

$$\mu(3, \infty) = \mu(\infty) - \mu(3) = 8 - 8(1 - e^{-3}) = 8e^{-3} \approx 0.398.$$

This is illustrated in Figure 4.11.

Figure 4.11  The expected number of errors after time 3, $\mu(3, \infty)$

The number of errors the rat makes in the period after having spent three
hours in the maze has the distribution $\text{Poisson}(\mu(3, \infty))$, that is,
Poisson(0.398). So the probability that no errors are made is

$$e^{-0.398} \approx 0.672.$$

Notice that this calculation assumes that the laboratory rat lives forever,
and remains in the maze throughout. This sort of assumption is not unusual
in a modelling context where it leads to a simplification of the model. In this
example, the mean number of mistakes in the period after 10 hours in the
maze is only about 0.0004, and the mean number of mistakes in the period
after 100 hours in the maze is negligible ($\mu(100, \infty) = 8e^{-100} \approx 3 \times 10^{-43}$).
So, the model suggests that all mistakes will be made within the first
100 hours in the maze, which is well within the lifetime of the rat. Hence the
assumption that the rat lives forever does not invalidate results obtained
from the model.  □
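The calculations in this solution can be checked with a few lines of code. The sketch below uses the expression $\mu(t) = 8(1 - e^{-t})$ obtained in part (i); the function and variable names are illustrative. Because it does not round the mean to 1.86 or 0.398 before exponentiating, the last decimal place may differ slightly from the values quoted above.

```python
import math


def mu(t):
    """Expected number of errors in (0, t] for the rate 8 * exp(-t)."""
    return 8.0 * (1.0 - math.exp(-t))


def mu_interval(t1, t2):
    """Expected number of errors in (t1, t2]."""
    return mu(t2) - mu(t1)


def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean)."""
    return sum(math.exp(-mean) * mean**x / math.factorial(x) for x in range(k + 1))


first_hour = mu(1.0)                                      # part (ii)
second_hour = mu_interval(1.0, 2.0)                       # part (iii), mean
p_more_than_two = 1.0 - poisson_cdf(2, second_hour)       # part (iii), probability
after_three = mu_interval(3.0, math.inf)                  # part (iv), mean
p_no_more_errors = math.exp(-after_three)                 # part (iv), probability

print(round(first_hour, 3), round(second_hour, 3), round(p_more_than_two, 3))
print(round(after_three, 3), round(p_no_more_errors, 3))
```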

Question 4.1  In Example 4.1, the accident rate at time $t$ of a young girl
learning to ride a bicycle is given by

$$\lambda(t) = \frac{24}{2 + t}, \quad t \ge 0,$$

where $t$ is measured in days.

(i)  Find the expected number of accidents in the first $t$ days.

(ii)  Find  the  expected  number  of  accidents  during  the  first  week. 

(iii)  Find  the  expected  number  of  accidents  during  the  third  week. 

(iv)  Find  the  probability  that  week  4  is  entirely  free  of  accidents. 

Note  that  this  model  makes  the  rather  unrealistic  assumption  that  the  novice 
rider  does  nothing  else  during  the  learning  period.  □ 

Waiting  times 

The distribution of $X(t)$ can be used to obtain the distribution of the waiting
time $T_1$ until the first event. The derivation makes use of the fact that the waiting
time to the first event exceeds $t$ if and only if there are no events in $(0, t]$: $[T_1 > t]$
and $[X(t) = 0]$ are equivalent events. (We used this equivalence in Subsection 3.2
to derive the distribution of $T_1$ in a simple Poisson process.) It follows that

$$P(T_1 > t) = P(X(t) = 0).$$

Since $X(t) \sim \text{Poisson}(\mu(t))$,

$$P(X(t) = 0) = e^{-\mu(t)},$$

and so

$$P(T_1 > t) = e^{-\mu(t)}.$$

The c.d.f. of $T_1$ is

$$F_{T_1}(t) = P(T_1 \le t) = 1 - P(T_1 > t).$$

So the c.d.f. of the waiting time until the first event is given by

$$F_{T_1}(t) = 1 - e^{-\mu(t)}, \quad t \ge 0. \tag{4.1}$$

By a similar argument, we can use the distribution of $X(t_1, t_2)$ to find the
distribution of the time $T$ from the start of observation until the next event after
some given time $v$. We use the fact that the time $T$ exceeds $t$ if and only if there
are no events in the interval $(v, t]$. So

$$P(T > t) = P(X(v, t) = 0).$$

Since $X(v, t) \sim \text{Poisson}(\mu(v, t))$, we have

$$P(T > t) = e^{-\mu(v, t)},$$

and hence the c.d.f. of the time $T$ at which the next event after time $v$ occurs is
given by

$$F_T(t) = P(T \le t) = 1 - e^{-\mu(v, t)}, \quad t \ge v. \tag{4.2}$$

We  shall  shortly  be  using  Results  (4.1)  and  (4.2)  to  simulate  the  times  of  events  in 
a  non-homogeneous  Poisson  process.  But  first,  try  the  next  question. 


Question 4.2  Suppose that events in a non-homogeneous Poisson process occur at the rate

$$\lambda(t) = 2t, \quad t > 0.$$

(i) Find the expected number of events to occur by time $t$.

(ii) If observation starts at time $t = 0$, find the probability $P(T_1 > t)$ for the waiting time $T_1$ until the first event.

(iii) Find the expected waiting time until the first event, $E(T_1)$.

Hint: Use Result 11.5 from page 5 of the Handbook.

(iv) If $T$ is the time from the start of observation until the first event after time $v$ occurs, find $P(T > t)$. □


Simulation

The times of occurrence of events in a non-homogeneous Poisson process can be simulated using the Probability-integral Transformation and Results (4.1) and (4.2).

Recall that observations on a continuous random variable $X$ may be simulated by solving the equation $F(x) = u$ for $x$, where $u$ is a uniform random number. So if $F_{T_1}(t)$ is the c.d.f. of $T_1$, the time at which the first event occurs, then, given a uniform random number $u$, we can generate a random observation $t_1$ on $T_1$ by solving the equation $F_{T_1}(t_1) = u$ for $t_1$. That is, using Result (4.1), by solving

$$1 - e^{-\mu(t_1)} = u.$$

This gives

$$e^{-\mu(t_1)} = 1 - u,$$

or equivalently

$$\mu(t_1) = -\log(1 - u).$$

This equation can then be solved for $t_1$ to give the simulated time of occurrence for the first event.

If we assume that the $j$th event occurs at time $t_j$, the time $t_{j+1}$ of the $(j+1)$th event may be simulated as follows. Since $T_{j+1}$ is the time at which the next event occurs after time $t_j$, putting $v = t_j$ in Result (4.2) gives the c.d.f. of $T_{j+1}$:

$$F_{T_{j+1}}(t) = 1 - e^{-\mu(t_j, t)} = 1 - e^{-[\mu(t) - \mu(t_j)]}.$$

We can generate a random observation $t_{j+1}$ on $T_{j+1}$ by solving for $t_{j+1}$ the equation

$$1 - e^{-[\mu(t_{j+1}) - \mu(t_j)]} = u,$$

where $u$ is a random observation from $U(0, 1)$. This gives

$$e^{-[\mu(t_{j+1}) - \mu(t_j)]} = 1 - u,$$

or equivalently

$$\mu(t_{j+1}) - \mu(t_j) = -\log(1 - u),$$

which gives

$$\mu(t_{j+1}) = \mu(t_j) - \log(1 - u).$$




These  results  for  simulating  the  times  at  which  events  occur  in  a 
non-homogeneous  Poisson  process  are  summarized  in  the  box  below. 


Simulation 

Given a random observation $u$ from $U(0, 1)$, the simulated time of occurrence of the first event in a non-homogeneous Poisson process is $t_1$, where $t_1$ is the solution of

$$\mu(t_1) = -\log(1 - u). \tag{4.3}$$

Suppose the $j$th event occurs at time $t_j$. Then, given a random observation $u$ from $U(0, 1)$, the simulated time at which the $(j+1)$th event occurs is $t_{j+1}$, where $t_{j+1}$ is obtained from

$$\mu(t_{j+1}) = \mu(t_j) - \log(1 - u). \tag{4.4}$$


Notice that if $j$ is set equal to 0 in (4.4) and if $t_0$ is defined to be equal to 0, then (4.4) can be used as the formula to generate the simulated time $t_1$ as well as all subsequent times $t_2, t_3, \ldots$.
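
When the mean function cannot be inverted by hand, Formula (4.4) can still be applied numerically. The following Python sketch is one possible way to do this, not a method given in the unit: it solves (4.4) for each new time by bisection, assuming only that `mu` is a continuous, non-decreasing function with `mu(0) = 0`. The function names, the search limit `t_max` and the iteration count are illustrative assumptions.

```python
import math

def next_event_time(mu, t_prev, u, t_max=1e6, iterations=200):
    """Solve mu(t) = mu(t_prev) - log(1 - u) for t > t_prev by bisection.

    Returns None if the target value is not reached before t_max, which
    can happen when mu is bounded above.
    """
    target = mu(t_prev) - math.log(1.0 - u)
    lo, hi = t_prev, t_prev + 1.0
    # Widen the bracket until mu(hi) reaches the target value.
    while mu(hi) < target:
        hi = t_prev + 2.0 * (hi - t_prev)
        if hi > t_max:
            return None
    # mu is non-decreasing, so bisection homes in on the solution.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mu(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simulate_event_times(mu, uniforms):
    """Apply Formula (4.4) repeatedly, starting from t_0 = 0."""
    times, t = [], 0.0
    for u in uniforms:
        t = next_event_time(mu, t, u)
        if t is None:          # no further events in this realization
            break
        times.append(t)
    return times
```

For a homogeneous process with rate 3, for instance, `simulate_event_times(lambda t: 3.0 * t, [0.5])` should return a single time close to $\log(2)/3$, in agreement with solving (4.3) directly.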


Example  4.5 

In Example 4.4, it was shown that the expected number of errors a laboratory rat makes during its first $t$ hours in a maze is given by

$$\mu(t) = 8(1 - e^{-t}).$$

We shall use the random numbers $u_1 = 0.96417$, $u_2 = 0.63336$, $u_3 = 0.88491$ (taken from the sixth row of the table on page 42 of Neave) to simulate the times at which the rat makes its first three mistakes.

Using (4.3), the simulated time $t_1$ at which the first mistake is made is the solution of

$$\mu(t_1) = -\log(1 - u_1),$$

that is

$$8(1 - e^{-t_1}) = -\log(1 - 0.96417) \simeq 3.3290.$$

Therefore

$$e^{-t_1} = 1 - \frac{3.3290}{8} = 0.583875,$$

giving

$$t_1 \simeq 0.53807 \text{ hours} \simeq 32 \text{ minutes}.$$

Now we can use (4.4) to find the simulated time $t_2$ for the rat's second mistake:

$$\mu(t_2) = \mu(t_1) - \log(1 - u_2) = 3.3290 - \log(1 - 0.63336) \simeq 4.3324.$$

Solving

$$8(1 - e^{-t_2}) = 4.3324$$

leads to

$$t_2 \simeq 0.77990 \text{ hours} \simeq 47 \text{ minutes}.$$

Using (4.4) again, we can find the simulated time $t_3$ for the rat's third mistake:

$$\mu(t_3) = \mu(t_2) - \log(1 - u_3) = 4.3324 - \log(1 - 0.88491) \simeq 6.4944.$$

Solving

$$8(1 - e^{-t_3}) = 6.4944$$

leads to

$$t_3 \simeq 1.67025 \text{ hours} \simeq 1 \text{ hour } 40 \text{ minutes}.$$


So in this simulation the rat makes its first three mistakes 32 minutes, 47 minutes and 1 hour 40 minutes after entering the maze. (Note that these are the times at which the mistakes are made; they are not inter-event times. The notation $T_1, T_2, T_3, \ldots$ is usually reserved for the inter-event times in a random process; its
use  here  is  an  exception  to  this  convention.)  □ 
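
The arithmetic of Example 4.5 can be checked with a short Python sketch. Here the mean function $\mu(t) = 8(1 - e^{-t})$ is inverted in closed form, so no numerical root-finding is needed; the function names are illustrative assumptions. Because $\mu(t)$ never reaches 8, the sketch stops as soon as the accumulated value of $\mu(t_j) - \log(1 - u)$ reaches 8, in which case no further mistake occurs in that realization.

```python
import math

def mu_inverse(m):
    """Invert mu(t) = 8(1 - exp(-t)); valid only for 0 <= m < 8."""
    return -math.log(1.0 - m / 8.0)

def rat_mistake_times(uniforms):
    """Simulated mistake times (in hours) using Formulas (4.3) and (4.4)."""
    times, m = [], 0.0              # m tracks mu(t_j), with mu(t_0) = 0
    for u in uniforms:
        m -= math.log(1.0 - u)      # Formula (4.4): mu(t_{j+1})
        if m >= 8.0:                # mu(t) < 8 for all t, so no solution
            break
        times.append(mu_inverse(m))
    return times

times = rat_mistake_times([0.96417, 0.63336, 0.88491])
for j, t in enumerate(times, start=1):
    print(f"t_{j} = {t:.5f} hours = {60 * t:.0f} minutes")
```

The printed times should agree with the hand calculation to about four decimal places; small differences in the last digit arise because the worked example rounds intermediate values.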

*  Question 4.3  In the non-homogeneous Poisson process of Question 4.2 events occur at the rate

$$\lambda(t) = 2t, \quad t > 0.$$

Simulate the times $t_1$, $t_2$, $t_3$ at which the first three events occur in a realization of the process, using the uniform random numbers $u_1 = 0.622$, $u_2 = 0.239$, $u_3 = 0.775$. □


The calculations involved in simulating the times of events in a non-homogeneous Poisson process can sometimes be simplified by solving Formula (4.4) algebraically to obtain a recurrence relation for the simulated times $t_1, t_2, t_3, \ldots$. This is illustrated in the next example.


Example  4.6 

Occasionally a machine malfunctions, resulting in the production of defective items. The incidence of these errors may be modelled as a non-homogeneous Poisson process in continuous time, with mean rate at time $t$ given by

$$\lambda(t) = \frac{2t}{1 + t^2}.$$

Therefore the mean number of events in the time interval $(0, t]$ is given by

$$\mu(t) = \int_0^t \lambda(v)\,dv = \int_0^t \frac{2v}{1 + v^2}\,dv = \left[\log(1 + v^2)\right]_0^t = \log(1 + t^2) - \log(1) = \log(1 + t^2).$$

The times of machine malfunctions can be simulated using Formula (4.4):

$$\mu(t_{j+1}) = \mu(t_j) - \log(1 - u).$$

In this case, this gives

$$\log(1 + t_{j+1}^2) = \log(1 + t_j^2) - \log(1 - u).$$

Rather than using this directly, we shall rearrange it to give $t_{j+1}$ explicitly, as follows. First taking exponentials, we obtain

$$1 + t_{j+1}^2 = \frac{1 + t_j^2}{1 - u}.$$

Therefore

$$t_{j+1}^2 = \frac{1 + t_j^2}{1 - u} - 1 = \frac{(1 + t_j^2) - (1 - u)}{1 - u} = \frac{t_j^2 + u}{1 - u},$$

and so

$$t_{j+1} = \sqrt{\frac{t_j^2 + u}{1 - u}}.$$
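
The recurrence relation obtained from Formula (4.4) for $\mu(t) = \log(1 + t^2)$ lends itself to a simple loop. The Python sketch below is illustrative only: the function name and the `horizon` argument (the end of the observation interval) are assumptions, and the uniform random numbers are supplied by the caller.

```python
import math

def malfunction_times(uniforms, horizon=3.0):
    """Iterate t_{j+1} = sqrt((t_j^2 + u) / (1 - u)) from t_0 = 0.

    Stops at the first simulated time beyond the horizon, so the list
    returned contains only the events falling in (0, horizon].
    """
    times, t = [], 0.0
    for u in uniforms:
        t = math.sqrt((t * t + u) / (1.0 - u))
        if t > horizon:
            break
        times.append(t)
    return times
```

Each call to the loop body uses one uniform random number, exactly as in a hand calculation, and the number of events in the interval is simply the length of the returned list.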


We shall use this recurrence relation to simulate the occurrence of malfunctions during the interval $(0, 3]$. Using random numbers from row 22 of the table on page 42 of Neave (0.28000, 0.44301, ...) gives the following simulated times:

$$t_1 = \sqrt{\frac{0^2 + 0.28000}{1 - 0.28000}} \simeq 0.6236,$$

$$t_2 = \sqrt{\frac{0.6236^2 + 0.44301}{1 - 0.44301}} \simeq 1.2221,$$

$$t_3 = \sqrt{\frac{1.2221^2 + 0.40028}{1 - 0.40028}} \simeq 1.7770,$$

$$t_4 = \sqrt{\frac{1.7770^2 + 0.88132}{1 - 0.88132}} \simeq 5.8338.$$

(Retaining full calculator accuracy throughout leads to $t_4 \simeq 5.8339$.)

The fourth event occurs after time 3, and so there are three malfunctions in $(0, 3]$ in this simulation. □


Question 4.4  Show that, for the non-homogeneous Poisson process of Questions 4.2 and 4.3, with rate $\lambda(t) = 2t$, $t > 0$, a simulation scheme is provided by the recurrence relation

$$t_{j+1} = \sqrt{t_j^2 - \log(1 - u)}.$$

Hence simulate the times $t_1, \ldots, t_4$ of the first four events in a realization of the process, using the uniform random numbers

$$u_1 = 0.927, \quad u_2 = 0.098, \quad u_3 = 0.397, \quad u_4 = 0.604.$$ □

4.2  The compound Poisson process

One of the assumptions of the Poisson process is that events always occur singly and there are no multiple occurrences. In this subsection we shall extend the Poisson process to the situation where multiple occurrences are allowed. You met an example of this sort of situation at the beginning of Section 3: the occurrence of major earthquakes may be modelled by a Poisson process, each earthquake resulting in a number of fatalities. Several more examples are described below.

Example 4.7

Buses arrive at a shopping centre according to a Poisson process at an average rate of one bus each minute. (So the number of bus arrivals $X(t)$ over an interval of observation of duration $t$ hours, say, has a Poisson distribution with mean $60t$.) Suppose that the number of passengers that disembark from each bus is an independent observation on the discrete random variable $Y$: there will often be more than one passenger disembarking from a bus. Then over the interval of observation the total number of disembarking passengers is given by

$$S(t) = Y_1 + Y_2 + \cdots + Y_{X(t)},$$

where $Y_i$ is the number of passengers to disembark from the $i$th bus. □

Example 4.8

Private cars call in at a roadside restaurant according to a Poisson process at an average rate of one car every four minutes from 12 noon to 2 pm. The number of passengers (including the driver) in each car is an observation on the discrete random variable $Y$ with geometric distribution $G_1(0.4)$. If $X(2)$ is the number of cars that arrive in the two-hour period, then the total number of visitors to the restaurant over the two-hour period is given by

$$S(2) = Y_1 + Y_2 + \cdots + Y_{X(2)},$$

where $X(2)$ has a Poisson distribution with parameter 30 ($\lambda = 15$ private cars per hour, on average, and $t = 2$ hours) and each $Y_i$ has the given geometric distribution. □




Example  4.9 

Events occur in time according to a Poisson process with rate $\lambda$; however, each event has only a probability $p$ of being detected and recorded, independently of whether other events are recorded. Thus associated with each event is the random variable $Y$ taking the value 1 if the event is recorded, 0 if it is not. If $X(t)$ is the number of events that occur in a time interval of length $t$, then the number of events recorded over the time interval is given by

$$S(t) = Y_1 + Y_2 + \cdots + Y_{X(t)},$$

where $Y_i \sim B(1, p)$ and $X(t)$ has a Poisson distribution with parameter $\lambda t$. □

In these examples, events occur according to a Poisson process, and associated with each event is another event for which multiple occurrences are possible. The number of occurrences associated with each event in the Poisson process is an observation on a discrete random variable $Y$. Such a process is called a compound Poisson process. Each event of the Poisson process is referred to as a compound event. If the number of compound events to occur in $(0, t]$ is denoted by $X(t)$ and the number of occurrences associated with the $i$th compound event is denoted by $Y_i$, then $S(t)$, the total number of occurrences of events to occur in the interval $(0, t]$, is given by

$$S(t) = Y_1 + Y_2 + \cdots + Y_{X(t)}.$$

In Unit 1, several results for sums of random variables were given. However, these results apply only when the number of random variables in the sum is known, so they cannot be applied to obtain information about $S(t)$: $S(t)$ is the sum of $X(t)$ random variables and $X(t)$ is itself a random variable, so its value is not known.

In Unit 4 you will see how to determine the probability distribution of $S(t)$ using probability generating functions. At this stage of the course we shall be content with finding expressions for the mean and variance of $S(t)$. In the derivation that follows, we shall be making use of some of the ideas from Section 1 concerning conditional probability and conditional expectation. You should try to follow the derivation, but you will not be expected to reproduce it.

We shall denote the mean and variance of $Y$, the number of occurrences associated with each compound event, by $\mu$ and $\sigma^2$ respectively, so that

$$E(Y) = \mu, \quad V(Y) = \sigma^2.$$

The  mean  of  S(t) 

The mean of $S(t)$ is given by

$$E[S(t)] = \sum_{s=0}^{\infty} s\,P(S(t) = s).$$

Since $X(t)$ must take just one of the values $x = 0, 1, \ldots$, by the Theorem of Total Probability,

$$P(S(t) = s) = \sum_{x=0}^{\infty} P(S(t) = s \mid X(t) = x)\,P(X(t) = x).$$

Thus we can write

$$E[S(t)] = \sum_{s=0}^{\infty} \sum_{x=0}^{\infty} s\,P(S(t) = s \mid X(t) = x)\,P(X(t) = x).$$

Interchanging the order of summation on the right-hand side gives:

$$E[S(t)] = \sum_{x=0}^{\infty} P(X(t) = x) \sum_{s=0}^{\infty} s\,P(S(t) = s \mid X(t) = x). \tag{4.5}$$

Now the sum over $s$ is simply the expected value of $S(t)$ when $X(t)$ takes the particular value $x$, that is, it is the conditional expectation $E[S(t) \mid X(t) = x]$. So we can write Equation (4.5) as

$$E[S(t)] = \sum_{x=0}^{\infty} P(X(t) = x)\,E[S(t) \mid X(t) = x]. \tag{4.6}$$

When $X(t)$ is equal to $x$, $S(t)$ is the sum of a fixed number, $x$, of random variables $Y_i$: $S(t) = Y_1 + Y_2 + \cdots + Y_x$. Since the $Y_i$ $(i = 1, \ldots, x)$ are identically distributed with mean $\mu$, the expected value of $Y_1 + Y_2 + \cdots + Y_x$ is $x\mu$. That is,

$$E[S(t) \mid X(t) = x] = x\mu.$$

Substituting this in Equation (4.6) gives

$$E[S(t)] = \sum_{x=0}^{\infty} P(X(t) = x)\,x\mu = \mu \sum_{x=0}^{\infty} x\,P(X(t) = x) = \mu E[X(t)].$$

Since $X(t)$ is the number of events occurring in time $t$ in a Poisson process with rate $\lambda$, $X(t) \sim \text{Poisson}(\lambda t)$ and so $E[X(t)] = \lambda t$. Thus we have the following result: the expected number of events in an interval of length $t$ is

$$E[S(t)] = \mu\lambda t.$$

You might have expected this result intuitively, since $\lambda t$ compound events are expected in the interval $(0, t]$, and $\mu$ is the mean number of events associated with each compound event.

The  variance  of  S(t) 

The variance of $S(t)$ can be calculated using a similar method.

First note that the variance of any random variable $X$ is given by

$$V(X) = E(X^2) - (E(X))^2,$$

and so

$$E(X^2) = V(X) + (E(X))^2. \tag{4.7}$$

The variance of $S(t)$ is found from

$$V[S(t)] = E[(S(t))^2] - (E[S(t)])^2.$$

We have already found an expression for $E[S(t)]$, so it remains to find $E[(S(t))^2]$. Just as, in Equation (4.6), we wrote $E[S(t)]$ in terms of the conditional expectations $E[S(t) \mid X(t) = x]$, so we can write $E[(S(t))^2]$ in terms of the conditional expectations $E[(S(t))^2 \mid X(t) = x]$. A similar argument to that used to obtain Equation (4.6) leads to

$$E[(S(t))^2] = \sum_{x=0}^{\infty} P(X(t) = x)\,E[(S(t))^2 \mid X(t) = x].$$

Now, when $X(t) = x$, $S(t) = Y_1 + Y_2 + \cdots + Y_x$, so

$$\begin{aligned}
E[(S(t))^2 \mid X(t) = x] &= E[(Y_1 + Y_2 + \cdots + Y_x)^2] \\
&= E[\,Y_1^2 + (Y_1Y_2 + Y_1Y_3 + \cdots + Y_1Y_x) \\
&\qquad + Y_2^2 + (Y_2Y_1 + Y_2Y_3 + \cdots + Y_2Y_x) \\
&\qquad + \cdots \\
&\qquad + Y_x^2 + (Y_xY_1 + Y_xY_2 + \cdots + Y_xY_{x-1})\,].
\end{aligned}$$

In this sum there are $x$ terms of the form $E(Y_i^2)$ and $x(x - 1)$ terms of the form $E(Y_iY_j)$ where $i \ne j$. Since $Y_i$ and $Y_j$ are independent with mean $\mu$, it follows that $E(Y_iY_j) = E(Y_i)E(Y_j) = \mu^2$. Also, since $E(Y_i) = \mu$ and $V(Y_i) = \sigma^2$, using Equation (4.7) it follows that

$$E(Y_i^2) = \sigma^2 + \mu^2.$$

Therefore

$$\begin{aligned}
E[(S(t))^2 \mid X(t) = x] &= xE(Y_i^2) + x(x - 1)E(Y_iY_j) \\
&= x(\sigma^2 + \mu^2) + x(x - 1)\mu^2 \\
&= x\sigma^2 + x^2\mu^2.
\end{aligned}$$

Using this result, we have

$$\begin{aligned}
E[(S(t))^2] &= \sum_{x=0}^{\infty} P(X(t) = x)\,E[(S(t))^2 \mid X(t) = x] \\
&= \sum_{x=0}^{\infty} P(X(t) = x)\,(x\sigma^2 + x^2\mu^2) \\
&= \sigma^2 \sum_{x=0}^{\infty} x\,P(X(t) = x) + \mu^2 \sum_{x=0}^{\infty} x^2\,P(X(t) = x) \\
&= \sigma^2 E[X(t)] + \mu^2 E[(X(t))^2] \\
&= \sigma^2 E[X(t)] + \mu^2\left(V[X(t)] + (E[X(t)])^2\right).
\end{aligned}$$

Since $X(t) \sim \text{Poisson}(\lambda t)$ we know that $E[X(t)] = V[X(t)] = \lambda t$; so

$$E[(S(t))^2] = \sigma^2\lambda t + \mu^2(\lambda t + (\lambda t)^2).$$

Hence we have

$$\begin{aligned}
V[S(t)] &= E[(S(t))^2] - (E[S(t)])^2 \\
&= \sigma^2\lambda t + \mu^2(\lambda t + (\lambda t)^2) - (\mu\lambda t)^2 \\
&= \sigma^2\lambda t + \mu^2\lambda t + \mu^2(\lambda t)^2 - (\mu\lambda t)^2 \\
&= \lambda t(\sigma^2 + \mu^2).
\end{aligned}$$

These  results  for  a  compound  Poisson  process  are  stated  in  the  box  below. 


If $S(t)$ is the number of events to occur in $(0, t]$ in a compound Poisson process in which compound events occur at the rate $\lambda$, then

$$E[S(t)] = \mu\lambda t, \tag{4.8}$$

$$V[S(t)] = \lambda t(\sigma^2 + \mu^2), \tag{4.9}$$

where $\mu$ and $\sigma^2$ are the mean and variance of the number of events associated with each compound event.


Example  4.10 


In Example 4.8, the number of passengers in each car calling at a roadside restaurant is an observation on a geometric random variable: $Y_i \sim G_1(0.4)$. (The mean and variance of $X \sim G_1(p)$ are $1/p$ and $q/p^2$ respectively.) So

$$\mu = E(Y_i) = \frac{1}{0.4} = 2.5,$$

$$\sigma^2 = V(Y_i) = \frac{0.6}{0.4^2} = 3.75.$$

Cars arrive according to a Poisson process with rate $\lambda = 15$ per hour during the two-hour period from 12 noon to 2 pm. Using Results (4.8) and (4.9) with $\lambda = 15$ and $t = 2$, the mean and variance of $S(2)$, the total number of customers to the restaurant over the two-hour period, are

$$E[S(2)] = 2.5 \times 15 \times 2 = 75,$$

$$V[S(2)] = 15 \times 2 \times (3.75 + 2.5^2) = 300.$$ □
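
Results (4.8) and (4.9) can be checked against a simulation. The Python sketch below is a rough illustration under stated assumptions, not part of the course material: it evaluates the two formulas for the restaurant example, then simulates 5000 two-hour periods, generating car arrivals from exponential inter-arrival times and party sizes from $G_1(0.4)$. The function names, the seed and the number of replications are arbitrary choices.

```python
import random

def compound_poisson_moments(lam, t, mu, sigma2):
    """Return (E[S(t)], V[S(t)]) from Results (4.8) and (4.9)."""
    return mu * lam * t, lam * t * (sigma2 + mu ** 2)

def draw_geometric_g1(p):
    """Sampler for G1(p): number of Bernoulli(p) trials up to the first success."""
    def draw(rng):
        n = 1
        while rng.random() >= p:
            n += 1
        return n
    return draw

def simulate_total(lam, t, draw_y, rng):
    """One realization of S(t): sum of Y over the compound events in (0, t]."""
    total, clock = 0, rng.expovariate(lam)
    while clock <= t:                  # exponential gaps give Poisson arrivals
        total += draw_y(rng)
        clock += rng.expovariate(lam)
    return total

rng = random.Random(343)               # arbitrary seed, for repeatability
mean, var = compound_poisson_moments(15, 2, 2.5, 3.75)
samples = [simulate_total(15, 2, draw_geometric_g1(0.4), rng) for _ in range(5000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / (len(samples) - 1)
print(mean, var, round(sample_mean, 2), round(sample_var, 1))
```

The sample mean and variance should lie close to the theoretical values 75 and 300, although they will not match exactly because of sampling variation.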




Question  4.5  Suppose  that  shoppers  leave  a  small  village  shop  according  to  a 
Poisson  process  at  an  average  rate  of  one  every  five  minutes.  Independently  of  one 
another,  each  shopper  makes  Y  purchases,  where  Y  has  the  following  probability 
distribution. 


y           0     1     2     3
P(Y = y)    0.2   0.3   0.4   0.1

(i)  Find  the  mean  and  variance  of  Y. 

(ii)  Hence  find  the  mean  and  variance  of  the  total  number  of  purchases  made 
over  a  three-hour  period.  □ 

Question 4.6  Suppose that road accidents follow a Poisson process with rate $\lambda$ and that $Y_i$, the number of casualties in the $i$th accident ($i = 1, 2, \ldots$), has a geometric distribution $Y_i \sim G_0(a)$, $0 < a < 1$. Determine the mean and variance of the number of casualties to occur over a time interval of duration $t$. □

Question 4.7  Suppose that events occur according to a Poisson process at a rate $\lambda$ and that each event has a probability $p$ of being detected.

(i) Determine the mean and the variance of the number of events detected in the time interval $(0, t]$.

(ii) From your answers to part (i), can you suggest the probability distribution of the number of detected events? How could you have known this? □

Postscript:  a  note  on  expectation 

A useful generalization of Result (4.6), which is used on several occasions in later units, is as follows.

For any two discrete random variables $X$ and $Y$, the mean of $X$ can be written in terms of the conditional expectations $E(X \mid Y = y)$:

$$E(X) = \sum_{y} E(X \mid Y = y)\,P(Y = y). \tag{4.10}$$

The expression on the right-hand side of Formula (4.10) is of the form

$$\sum_{y} g(y)\,P(Y = y),$$

where the sum is taken over all the values $y$ that $Y$ can take, so it is the expectation of a function $g(Y)$ of $Y$. This function is usually denoted $E(X \mid Y)$, so Formula (4.10) may be written succinctly as

$$E(X) = E_Y(E(X \mid Y)). \tag{4.11}$$

The subscript $Y$ is included to make it clear that the expectation is of a function of $Y$.
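
A small numerical check may make Formula (4.10) more concrete. The joint distribution in the Python sketch below is hypothetical, invented purely for illustration; exact fractions are used so that the two ways of computing $E(X)$ can be compared without rounding error.

```python
from fractions import Fraction as F

# Hypothetical joint p.m.f. P(X = x, Y = y), keyed by (x, y); illustration only.
joint = {
    (0, 0): F(1, 8), (1, 0): F(1, 8),
    (0, 1): F(1, 4), (2, 1): F(1, 2),
}

def marginal_y(joint):
    """P(Y = y) for each value y, obtained by summing over x."""
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0) + p
    return p_y

def conditional_mean_x(joint, y, p_y):
    """E(X | Y = y) = sum over x of x * P(X = x, Y = y) / P(Y = y)."""
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y[y]

p_y = marginal_y(joint)
direct = sum(x * p for (x, _), p in joint.items())
via_conditioning = sum(conditional_mean_x(joint, y, p_y) * p for y, p in p_y.items())
print(direct, via_conditioning)
```

Both routes give the same value of $E(X)$, as Formula (4.10) requires.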


5  Point  processes 


Random  processes  that  consist  of  events  occurring  in  time  are  known  as  point 
processes.  The  models  that  we  studied  in  Sections  3  and  4,  all  of  which  were 
based  on  the  Poisson  process,  are  all  examples  of  point  processes.  In  this  section 
point  processes  are  discussed  in  rather  more  general  terms.  In  Subsection  5.1  a 
number  of  possible  types  of  model  which  are  not  based  on  a  Poisson  process  are 
described  briefly.  In  Subsection  5.2  a  method  of  assessing  point  processes 
mathematically  is  introduced. 


The  following  notation  is  standard  for  point  processes. 

$X(t)$ is the number of events occurring in the time interval $(0, t]$.

$X(t_1, t_2)$ is the number of events occurring in the time interval $(t_1, t_2]$.

$T_n$ is the waiting time between the $(n - 1)$th event in the sequence and the $n$th.
$W_n$ is the waiting time to the $n$th event in the sequence.


5.1  Types  of  point  process 

One  type  of  point  process  is  known  as  a  ‘renewal  process’.  In  such  a  process,  {Xi}, 
the  times  between  consecutive  events,  are  identically  and  independently 
distributed,  but  the  distribution  can  take  any  form.  This  type  of  process  gets  its 
name  from  the  practical  situation  where  components,  like  light  bulbs,  spark  plugs, 
etc.,  wear  out  and  fail  and  are  then  immediately  replaced  or  renewed.  Thus,  at 
each  renewal  the  process  starts  again,  and  the  lifetimes  of  components  are 
independent.  Since  the  replacement  components  are  identical,  the  lifetimes  all 
have  the  same  distribution.  In  general,  in  a  renewal  process,  the  probability  that 
an event occurs in the small time interval $[t, t + \delta t]$ depends on the time at which
the  previous  event  occurred.  The  Poisson  process,  with  exponentially  distributed 
inter-event  times,  is  the  only  renewal  process  where  the  memoryless  property 
applies. 

Another  set  of  possible  models  for  point  processes  arises  when  the  probability 
that  an  event  occurs  in  a  specific  time  interval  depends  on  the  number  of  events 
that have occurred previously. A model of this type, the simple birth process, is
discussed briefly in Section 6 of this unit. This type of process arises very
frequently: such processes will be studied in Units 7, 8, 9, 10 and 11, though in
these  units  other  properties  of  the  processes  will  be  investigated,  as  well  as  the 
times  at  which  events  occur. 

A  third  set  of  models  consists  of  those  which  describe  point  processes  where 
events  can  occur  only  at  discrete  time  points.  The  simplest  such  model  is  that  in 
which  events  occur  at  regular  intervals.  Another  such  model  is  the  Bernoulli 
process,  which  is  the  discrete  analogue  of  the  Poisson  process  and  has  many  of  the 
same  properties.  In  practice,  point  processes  in  discrete  time  occur  much  less 
frequently  than  do  those  in  continuous  time. 


5.2  The  index  of  dispersion 

When  comparing  different  models  for  point  processes,  the  most  convenient 
features  to  consider  are  the  distribution  of  the  number  of  events  in  a  fixed  time 
interval  and  the  distribution  of  the  times  between  events.  In  this  subsection  we 
shall  discuss  only  the  distribution  of  the  number  of  events,  because  this  is  often 
easier  to  calculate.  To  begin  with,  two  functions  are  defined:  the  mean-time 
function  and  the  variance-time  function.  These  are  simply  the  mean  and  the 
variance of $X(t)$, the number of events occurring in $(0, t]$, where $t > 0$. The mean-time function is denoted by $\mu(t)$, so

$$\mu(t) = E[X(t)],$$

and the variance-time function by $\sigma^2(t)$, so

$$\sigma^2(t) = V[X(t)].$$

Note that for a Poisson process with parameter $\lambda$, $X(t) \sim \text{Poisson}(\lambda t)$, so

$$\mu(t) = \sigma^2(t) = \lambda t.$$


Renewal  processes  are  studied  in 
detail  in  Unit  13. 


Random  processes  in  discrete  time 
are the subject of Units 4, 5 and 6,
though  these  are  not  considered  as 
point  processes. 


Recall that $X(t)$ is short for $X(0, t)$.




This equality means that $\sigma^2(t)/\mu(t) = 1$. For other point processes the value of the ratio $\sigma^2(t)/\mu(t)$ may be different from 1, and this fact can be used to compare any point process with a Poisson process. For a point process the index of dispersion, denoted $I(t)$, is defined by

$$I(t) = \frac{\sigma^2(t)}{\mu(t)}.$$

If events in a point process A have mean-time function equal to $\lambda t$ but occur in a more regular fashion than in the Poisson process, then process A's variance-time function is smaller than $\lambda t$ and so the index of dispersion for process A is less than 1. On the other hand, if there is more dispersion in a process B than in the Poisson process, so that very long and very short intervals are more likely to arise than in a Poisson process, then the index of dispersion for process B is greater than 1.

As  you  saw  in  Subsection  3.3,  the  multivariate  Poisson  process,  which  consists  of 
events  of  different  types  each  occurring  as  a  Poisson  process,  is  just  a  Poisson 
process with rate $\lambda$. Hence, for this process the index of dispersion is equal
to  1. 

Question 5.1  Calculate the index of dispersion for a non-homogeneous Poisson process $\{X(t); t \ge 0\}$. □


In Subsection 4.2 we discussed the compound Poisson process with rate $\lambda$, in which multiple occurrences are permitted. We derived the mean-time function, $E[S(t)]$, which is $\mu\lambda t$, and the variance-time function, $V[S(t)]$, which is $\lambda t(\sigma^2 + \mu^2)$. Accordingly, the index of dispersion for the compound Poisson process is

$$I(t) = \frac{V[S(t)]}{E[S(t)]} = \frac{\lambda t(\sigma^2 + \mu^2)}{\mu\lambda t} = \mu + \frac{\sigma^2}{\mu}. \tag{5.1}$$


Question  5.2  Find  the  index  of  dispersion  for  the  compound  Poisson  process 
described  in  Question  4.5,  and  interpret  your  answer.  □ 


Results  (4.8)  and  (4.9) 


$S(t)$ is the number of events in $(0, t]$ and $E(Y_i) = \mu$, $V(Y_i) = \sigma^2$, $Y_i$ being the number of events at the $i$th compound event.


Question 5.3  Use Formula (5.1) to find the index of dispersion for a compound Poisson process $\{S(t); t \ge 0\}$ when $Y$, the number of events at each compound event, is distributed as

(i) $G_0(a)$;

(ii) $\text{Poisson}(\mu)$. □


For the geometric distribution in Question 5.3(i), the index of dispersion of the compound Poisson process is

$$\frac{1 + a}{1 - a} = 1 + \frac{2a}{1 - a} = 1 + 2\mu,$$

where $\mu = E(Y)$. For the Poisson distribution in part (ii), the index of dispersion of the compound Poisson process is $1 + \mu$. In each case the index of dispersion is of the form

$$1 + (\text{expression involving } \mu),$$

so that $I(t)$ is greater than 1. In these cases the compound Poisson process with rate $\lambda$ displays more variability than does a simple Poisson process with rate $\lambda$. This is a reasonable result, for in the compound Poisson process there is variability not only in the occurrence of compound events but also in the number of events associated with each compound event.
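
Dividing Result (4.9) by Result (4.8) shows that the index of dispersion of a compound Poisson process depends only on the mean and variance of $Y$, not on $\lambda$ or $t$. The Python sketch below evaluates this ratio; the function name is an illustrative assumption. The values used for $G_0(a)$, mean $a/(1 - a)$ and variance $a/(1 - a)^2$, are the standard ones for a geometric distribution starting at 0 and are consistent with the expression $(1 + a)/(1 - a)$ quoted above.

```python
def index_of_dispersion(mu, sigma2):
    """I(t) = V[S(t)] / E[S(t)] = (sigma^2 + mu^2) / mu for a compound Poisson process."""
    return (sigma2 + mu ** 2) / mu

# Y ~ G0(a) with a = 0.5: mean a/(1 - a) = 1, variance a/(1 - a)^2 = 2.
a = 0.5
geometric_case = index_of_dispersion(a / (1 - a), a / (1 - a) ** 2)

# Y ~ Poisson(2): mean and variance both equal 2.
poisson_case = index_of_dispersion(2.0, 2.0)

print(geometric_case, poisson_case)
```

With these particular parameter values both cases give an index of 3, matching $1 + 2\mu$ with $\mu = 1$ and $1 + \mu$ with $\mu = 2$ respectively.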




6  Further  examples  of  random  processes 


In  this  brief  section  we  take  a  look  at  some  further  examples  of  random  processes, 
beginning  with  a  model  for  population  growth.  This  is  an  example  of  a  large  class 
of  random  processes  which  will  be  explored  in  Units  7  and  8. 

6.1  The  simple  birth  process 

The  simple  birth  process  is  used  to  model  the  growth  of  a  population  whose  size 
increases  with  time  as  offspring  are  born  to  existing  individuals  in  the  population. 

In the Poisson process, the probability that an event occurs in the small interval $[t, t + \delta t]$ is constant; in the simple birth process, this probability is modelled as follows. Suppose that at time $t = 0$ there is a single individual, who gives birth to further individuals according to a Poisson process with rate $\beta$. As soon as a new member is born, it also starts giving birth independently at the same rate $\beta$.

There are no deaths; the population simply increases as time passes. This model provides a reasonable fit to the increase in a colony of bacteria or cells where reproduction occurs by splitting of cells and is independent of the age of the individual. If, at some stage, the population has reached size $x$, then all $x$ members are independently capable of giving birth, so the overall birth rate will be $\beta x$. The postulates of the simple birth process are set out below.

The  simple  birth  process 

If a population is of size $x$ at time $t$, then:

I  the probability of (exactly) one birth during the small time interval $[t, t + \delta t]$ is equal to $\beta x\,\delta t + o(\delta t)$;

II  the probability of more than one birth during $[t, t + \delta t]$ is equal to $o(\delta t)$;

III  for any individual alive at time $t$, the incidence of births after time $t$ is independent of the incidence of births before time $t$.

Again there are several sequences of random variables associated with this process. We might be interested in: $\{X(t); t \ge 0\}$, the size of the population at time $t$ (with $X(0) = 1$); $\{T_n; n = 1, 2, \ldots\}$, the time between the $(n - 1)$th and $n$th births; or $\{W_n; n = 1, 2, \ldots\}$, the time up to the $n$th birth, that is, until the population reaches size $n + 1$. There are other sequences that might be of interest in a particular application (for example, one might wish to follow the offspring of a particular individual) but $\{X(t)\}$, $\{T_n\}$ and $\{W_n\}$ are the most common. We can state the distribution of $T_n$ from results we already know. Since the population starts with one individual, after $n - 1$ births there are $n$ individuals alive. Each of those is giving birth independently according to a Poisson process with rate $\beta$.

Hence, considering the process as a whole, the time until the next birth (the $n$th) is the minimum of $n$ exponential variates, each with distribution $M(\beta)$. It follows (see Subsection 5.2 of Unit 1) that

$$T_n \sim M(n\beta);$$

that is, the time between the $(n - 1)$th birth and the $n$th birth has an exponential distribution with mean $1/(n\beta)$. As the population size grows, the rate of births increases and the population grows more rapidly. However, because of the third postulate, all the random variables $T_n$ are mutually independent.
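
The distributional result $T_n \sim M(n\beta)$, together with the independence of the $T_n$, suggests one way of generating a realization of the process. The Python sketch below is an illustration under those assumptions only; the function name and the seed are arbitrary. It relies on the fact that dividing an observation from $M(1)$ by $n\beta$ gives an observation from $M(n\beta)$.

```python
import random

def birth_times(beta, unit_exponentials):
    """Cumulative birth times W_1, W_2, ... built from M(1) variates.

    The n-th inter-birth time is T_n = e_n / (n * beta), where e_n is an
    observation from M(1), so that T_n has the distribution M(n * beta).
    """
    times, w = [], 0.0
    for n, e in enumerate(unit_exponentials, start=1):
        w += e / (n * beta)
        times.append(w)
    return times

rng = random.Random(7)                      # arbitrary seed, for repeatability
sample = birth_times(1.0, [rng.expovariate(1.0) for _ in range(19)])
print([round(w, 3) for w in sample[:5]])
```

Nineteen births take the population from size 1 to size 20. The gaps between successive birth times tend to shrink as $n$ grows, reflecting the increasing overall birth rate.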




Question  6.1 

(i) Initially there is one individual alive in a simple birth process. Assuming a birth rate $\beta$, state the distribution of $T_1$, the waiting time to the first birth, and $T_2$, the waiting time from the first to the second birth.

(ii) Describe a method for simulating the times of births in a simple birth process.

(iii) When $\beta = 1$, simulate the times of the first five births in a simple birth process, using row 6 of the table of random numbers from the $M(1)$ distribution on page 43 of Neave. □

In Figure 6.1 we continue this simulation, and show the graph of a realization of $X(t)$, the size of the population, up to $X(t) = 20$. You can see that the rate of births increases with time.


Figure  6.1  Realization  of  a  simple  birth  process 

The  simple  birth  process  will  be  discussed  in  detail  in  Unit  7,  and  the  distributions 
of  X(t)  and  Wn  will  be  derived  there.  We  can  already  write  down  the  mean  and 
variance  of  Wn,  the  time  to  the  nth  birth,  using  the  results  obtained  above. 

Question  6.2 

(i) Write down the mean and variance of the waiting time $T_n$ in a simple birth process.

(ii) Hence state the mean and variance of $W_n$, the time to the $n$th birth in a simple birth process, using the fact that $W_n = T_1 + T_2 + \cdots + T_n$. □

6.2  Further  examples 

Example  6.1  Queues 

Very  many  probability  models  have  been  developed  to  describe  various  queueing 
situations.  The  simplest  situation  is  that  of  a  ticket  queue  at  a  box  office  where 
customers  arrive  and  join  the  end  of  the  queue;  they  eventually  reach  the  front 

and  are  served,  then  they  leave.  Sometimes  an  arriving  customer  will  find  there  is  Several  queueing  processes  are 
no  queue  and  so  he  is  served  immediately.  developed  and  analysed  in  Unit  9. 

The basic stochastic process for a simple queueing model is {Q(t); t ≥ 0}, where
Q(t) is the number of people in the queue at time t. Other processes of interest
include {Ln; n = 1, 2, ...}, where Ln is the number of people in the queue when
the nth customer arrives, and {Wn; n = 1, 2, ...}, where Wn is the time that the
nth customer has to wait before being served. Notice that Q(t) is defined for
continuous time but Ln and Wn for discrete ‘time’ (here n, the ‘time’ variable,
denotes the number of the customer). Notice also that {Q(t)} and {Ln} have
discrete state spaces but {Wn} has a continuous state space.

To  define  the  model  completely  we  need  to  specify  the  arrival  and  service 
mechanisms.  The  commonest  and  simplest  assumption  is  that  customers  arrive 
according  to  a  Poisson  process.  Frequently,  the  service  time  is  assumed  to  have  an 
exponential distribution. Different assumptions (for example, non-constant λ) are
made  where  appropriate.  □ 
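The model of Example 6.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a single server, Poisson arrivals, exponential service times, and Q(t) taken to include the customer being served. The function name simulate_queue and the parameter names arrival_rate and service_rate are hypothetical labels, not notation from this unit. The sketch relies on the memoryless property: while the queue is non-empty, the time to the next event (arrival or departure) is exponential with rate equal to the sum of the two rates.

```python
import random


def simulate_queue(arrival_rate, service_rate, t_end, q0=0, seed=None):
    """Return the jump times and the values of Q(t) on [0, t_end].

    Assumes Poisson arrivals and exponential service times (single server).
    """
    rng = random.Random(seed)
    t, q = 0.0, q0
    times, sizes = [0.0], [q0]
    while True:
        # A departure is possible only when somebody is in the queue.
        rate = arrival_rate + (service_rate if q > 0 else 0.0)
        t += rng.expovariate(rate)  # time to the next arrival or departure
        if t > t_end:
            break
        if rng.random() < arrival_rate / rate:
            q += 1  # an arrival joins the queue
        else:
            q -= 1  # a served customer leaves
        times.append(t)
        sizes.append(q)
    return times, sizes


# Illustrative values only: arrival rate 0.8, service rate 1, Q(0) = 3.
times, sizes = simulate_queue(0.8, 1.0, t_end=50.0, q0=3, seed=1)
```

Plotting sizes against times as a step function gives one realization of {Q(t); t ≥ 0}.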




Question 6.3 Suppose that customers arrive according to a Poisson process,
with rate λ, and that the service time is fixed, equal to one unit of time, for each
customer. Sketch rough graphs of how Q(t), the number of people in the queue,
might develop from Q(0) = 3 for the cases λ < 1, λ > 1 separately. Do not try to
simulate the process too carefully. □

The  simple  queueing  model  in  Example  6.1  can  be  extended  in  many  ways. 
Possibilities  include:  more  than  one  server;  arrival  according  to  an  appointments 
system;  arrival  of  customers  in  batches;  ‘baulking’,  which  is  a  term  used  to 
describe  the  phenomenon  of  a  long  queue  discouraging  customers  from  joining. 

Example 6.2 Light bulbs

Suppose there is a supply of light bulbs whose lifetimes are identically and
independently distributed. One bulb is used at a time, and as soon as it fails it is
replaced by a new one. The random process {Wn; n = 1, 2, ...} gives the sequence
of replacement times. If the distribution of bulb lifetimes is exponential, this will
be a Poisson process. However, this is not necessarily the case; the distribution
may take other forms, such as the gamma distribution. □

This is an example of a renewal process; such processes are discussed in Unit 13.

Question 6.4 Identify two possible sequences of random variables associated
with this process. □

The model in Example 6.2 can be used to assist in the development of strategy for
more complex situations. How should motorway lights be replaced? Is it most
economic to replace a whole section at once before many have failed? How soon
should this be done?
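The replacement times Wn of Example 6.2 are simply running totals of the bulb lifetimes, and this can be sketched as follows. The helper name replacement_times and the particular lifetime distributions (both with mean 2, chosen only for illustration) are assumptions, not part of the example.

```python
import random
from itertools import accumulate


def replacement_times(n, draw_lifetime):
    """Return W_1, ..., W_n as cumulative sums of n independent lifetimes."""
    return list(accumulate(draw_lifetime() for _ in range(n)))


rng = random.Random(1)

# Exponential lifetimes with mean 2: the replacement times form a Poisson process.
exp_times = replacement_times(5, lambda: rng.expovariate(0.5))

# Gamma lifetimes (shape 2, scale 1, so mean 2): still a renewal process,
# but no longer a Poisson process.
gamma_times = replacement_times(5, lambda: rng.gammavariate(2.0, 1.0))
```

Only the lifetime distribution changes between the two cases; the way the sequence {Wn} is built from the lifetimes is the same.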


7  Deterministic  models 


In  this  unit  we  have  considered  some  examples  of  random  processes,  many  of 
which  will  be  analysed  later  in  the  course.  For  random  processes  such  as  the 
Poisson  process  or  the  Bernoulli  process,  the  mathematics  is  quite  simple  and  we 
can  calculate  the  probability  distributions  of  all  random  variables  of  interest. 
However,  even  for  some  processes  that  seem  fairly  simple,  such  as  the  development 
of  a  queue  or  the  spread  of  a  disease,  the  mathematics  can  become  extremely 
difficult,  or  even  impossible,  to  handle  analytically.  In  these  cases  it  is  difficult  to 
identify  the  fundamental  properties  of  a  process  and  the  extent  to  which  the 
behaviour  of  the  process  depends  on  the  values  of  the  parameters  involved. 

One  useful  approach  is  to  carry  out  simulations,  which  will  certainly  give  a  good 
indication  of  likely  and  unlikely  realizations.  Indeed,  this  may  be  the  only  feasible 
method  of  proceeding.  However,  many  thousands  of  simulations  are  required  to 
give  good  estimates  of  the  various  probabilities  and,  particularly  if  each 
realization  continues  for  many  steps,  this  is  expensive  in  computer  time.  For  a 
complete  description,  simulations  are  required  for  a  whole  range  of  parameter 
values,  and  it  is  possible  that  some  important  feature  of  the  model  may  be  missed 
because  a  particularly  relevant  combination  of  parameter  values  was  not  used  in 
the  simulations. 

Another  approach  is  to  study  the  deterministic  version  of  the  model,  which 
ignores random fluctuations and investigates only the ‘average’ behaviour. When
a  population  is  large,  a  deterministic  model  provides  a  good  approximation  to  the 
stochastic  behaviour,  though  it  is  not  much  help  in  describing  the  behaviour  of  a 
small  population  subject  to  random  variation.  In  this  section  deterministic  models 
are  developed  for  some  simple  processes,  in  continuous  time  only:  in  this  course 
we  do  not  consider  deterministic  analogues  for  discrete-time  random  processes. 




At  this  stage  of  the  course  you  do  not  need  to  worry  about  the  details  of  the 
models;  simply  note  the  general  approach  of  the  method. 

Example  7.1  The  Poisson  process 


Figure 7.1 Three simulations X(t) = z of a Poisson
process {X(t); t ≥ 0}

Figure 7.1 shows three simulations X(t) = z of {X(t); t ≥ 0}, where X(t) is the
number of events to have occurred by time t in a Poisson process with rate λ.
In this example we know that X(t) ~ Poisson(λt) and so E[X(t)] = λt. So the
expected number of events to have occurred by time t in these simulations is λt,
and the line z = λt is also included in Figure 7.1. This line represents some kind
of ‘trend’ or average: it could not be a simulation, as simulations are all step
functions, incorporating jumps of size 1 at random times. □

Example  7.2  The  simple  birth  process 

Figure 7.2 shows four simulations of a simple birth process {X(t); t ≥ 0} with rate
β = 1, starting with one individual at time t = 0. At time t = 3, when simulation
ceased, the size of the population had increased to 51 in one simulation, but to
only 5 in another.


Figure 7.2 Four simulations X(t) = z of a simple birth
process {X(t); t ≥ 0}

Although the four simulations are very different, in all cases the overall birth rate,
or rate of change of X(t), appears to increase with time. It looks as if no straight
line could provide an ‘average’ of the behaviour; a curve with increasing slope
would be more appropriate. □


Here  z  is  just  a  convenient  label  for 
the  vertical  axis  in  Figure  7.1. 


The  postulates  for  the  simple  birth 
process  are  given  on  page  61. 




Before  finding  this  curve  for  the  simple  birth  process,  we  shall  return  to  the 
Poisson  process  to  demonstrate  the  general  method. 


Example  7.1,  continued 


We shall now see how to construct a deterministic model for the Poisson process.
From Postulate I, we know

P(one event occurs in [t, t + δt]) = λ δt + o(δt),

and since multiple events have probability o(δt),

P(no event occurs in [t, t + δt]) = 1 - (λ δt + o(δt)) - o(δt) = 1 - λ δt + o(δt).

Hence the expected number of events to occur in [t, t + δt] is

1 × (λ δt + o(δt)) + 0 × (1 - λ δt + o(δt)) = λ δt + o(δt).

In the deterministic model, we assume that, in any small time interval [t, t + δt],
the number of events to occur is exactly equal to λ δt + o(δt); that is, we ignore
random fluctuations. Now let z = z(t) denote the total number of events to have
occurred by time t in the deterministic model, so that

z(t + δt) = z(t) + λ δt + o(δt),


and hence

\[ \frac{z(t + \delta t) - z(t)}{\delta t} = \lambda + \frac{o(\delta t)}{\delta t}. \]

Since o(δt)/δt → 0 as δt → 0, taking the limit as δt → 0 gives

\[ \frac{dz(t)}{dt} = \lambda. \]


So, in this model, the rate at which events occur is a constant, λ. Solving this
differential equation, we obtain z(t) = λt + c.

Since the process starts at time t = 0 with no events, z(0) = 0 and so c = 0.
Hence the deterministic solution is

z(t) = λt,

which gives the number of events to have occurred by time t in the deterministic
analogue of a Poisson process. In this case, the deterministic solution is equal to
the mean of the random variable X(t): we say that it is equal to the mean of the
random process. □
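The agreement between the deterministic solution z(t) = λt and the mean of X(t) can be checked by simulation. The following sketch generates many realizations of a Poisson process by adding up exponential inter-event times and compares the average count with λt. The rate 0.5, the time 20 and the number of runs are arbitrary illustrative choices, and the function name poisson_count is hypothetical.

```python
import random


def poisson_count(rate, t, rng):
    """Count the events in (0, t] by summing exponential inter-event times."""
    count, clock = 0, rng.expovariate(rate)
    while clock <= t:
        count += 1
        clock += rng.expovariate(rate)
    return count


rng = random.Random(2)
rate, t, runs = 0.5, 20.0, 20000  # illustrative values, not taken from the text

mean_count = sum(poisson_count(rate, t, rng) for _ in range(runs)) / runs
deterministic = rate * t  # z(t) = lambda * t

print(f"simulated mean {mean_count:.2f}, deterministic value {deterministic:.2f}")
```

Individual realizations are step functions that wander on either side of the line; only their average settles down near it.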


Question  7.1 

(i) Derive a differential equation satisfied by the size, z(t), of the population at
time t in the deterministic analogue of the simple birth process with
parameter β.

(ii) Solve this equation, for a process starting with one individual at time 0. □

It follows from Solution 7.1(ii) that, when β = 1, the solution to the deterministic
analogue of the simple birth process starting with one individual is the
exponential curve z(t) = e^t, and in Figure 7.3 this curve is superimposed on the
four  simulations  of  Figure  7.2.  The  curve  does  indeed  provide  an  ‘average’  of  the 
developing  process,  though  it  gives  no  indication  of  the  amount  of  possible 
variability. 
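A rough numerical check of this ‘average’ property is sketched below. It simulates many realizations of the simple birth process with β = 1 up to time t = 3, using the fact (from Solution 6.1) that the waiting time to the next birth is exponential with parameter nβ when n individuals are alive, and compares the average population size with e^3. The function name birth_population and the number of runs are assumptions made for the illustration.

```python
import math
import random


def birth_population(beta, t_end, rng, start=1):
    """Return X(t_end) for one realization of a simple birth process."""
    n, clock = start, 0.0
    while True:
        clock += rng.expovariate(n * beta)  # waiting time ~ M(n * beta)
        if clock > t_end:
            return n
        n += 1


rng = random.Random(3)
runs = 20000
sizes = [birth_population(1.0, 3.0, rng) for _ in range(runs)]

simulated_mean = sum(sizes) / runs
deterministic = math.exp(3.0)  # z(3) = e^3, roughly 20.1

print(f"mean {simulated_mean:.2f}, e^3 = {deterministic:.2f}, "
      f"range {min(sizes)} to {max(sizes)}")
```

The printed range shows how widely single realizations can differ from the curve, which is the variability that the deterministic solution does not capture.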


Note that in the deterministic
model z is a function of time.




Figure 7.3 Simulations of the simple birth process and
the deterministic curve

Example  7.3 

In Example 4.1 we considered a model representing the number of accidents
experienced by a girl learning to ride a bicycle. The probability that an accident
occurs in the interval [t, t + δt] is [24/(2 + t)]δt + o(δt), irrespective of how many
accidents she has had previously, so the expected number of accidents in the
interval [t, t + δt] is [24/(2 + t)]δt + o(δt).

To derive a deterministic model for this non-homogeneous Poisson process, let
z(t) be the number of accidents to have occurred by time t. Then the number of
accidents to occur by time t + δt is

\[ z(t + \delta t) = z(t) + \frac{24}{2 + t}\,\delta t + o(\delta t). \]

Therefore

\[ \frac{z(t + \delta t) - z(t)}{\delta t} = \frac{24}{2 + t} + \frac{o(\delta t)}{\delta t}, \]

and, letting δt → 0, we obtain

\[ \frac{dz(t)}{dt} = \frac{24}{2 + t}. \]

We have derived a differential equation for the process, and we now proceed to
solve it by the method of separation of variables:

\[ z(t) = \int \frac{24}{2 + t}\,dt = 24\log(2 + t) + c. \]

We assume that at time t = 0, z(0) = 0, so we obtain c = -24 log 2. The final
solution is therefore

\[ z(t) = 24\log(2 + t) - 24\log 2 = 24\log\frac{2 + t}{2} = 24\log(1 + t/2). \]

Hence, for example, in the deterministic model the number of accidents in the
first ten days is 24 log(1 + 10/2) ≈ 43, and in the first 100 days it is 94.

In Question 4.1, you showed that in the stochastic model the mean number of
accidents by time t is equal to 24 log(1 + t/2). So you can see that z(t) is actually
equal to the mean number of accidents by time t using the stochastic model. This
is not a general property, but in many simple cases the deterministic solution is
equal to the mean of the stochastic model. □
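The two figures quoted in Example 7.3 can be checked numerically. The sketch below evaluates the closed form 24 log(1 + t/2) and, as an independent check, integrates the rate 24/(2 + u) over [0, t] with the midpoint rule. The function names z_closed and z_numeric and the number of steps are illustrative choices only.

```python
import math


def z_closed(t):
    """Deterministic solution z(t) = 24 log(1 + t/2), natural logarithm."""
    return 24 * math.log(1 + t / 2)


def z_numeric(t, steps=100_000):
    """Midpoint-rule approximation to the integral of 24/(2 + u) over [0, t]."""
    h = t / steps
    return h * sum(24 / (2 + (k + 0.5) * h) for k in range(steps))


for days in (10, 100):
    print(days, round(z_closed(days), 2), round(z_numeric(days), 2))
```

Both columns should agree closely, and rounding to the nearest whole number recovers the values 43 and 94 given in the example.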




This method of deriving a deterministic analogue generalizes to many stochastic
processes. Suppose that {X(t); t ≥ 0} is a stochastic process in continuous time,
representing the number of events that have occurred by time t. If a total of
X(t) = z events have occurred by time t, then we assume that:

I the probability that (exactly) one event occurs in the interval [t, t + δt] is
h(z, t)δt + o(δt), where h(z, t) is some function of z and t;

II the probability that two or more events occur in the interval [t, t + δt] is o(δt).

So we have a process where events occur singly, and the probability of an event
occurring in the interval [t, t + δt] may depend on z or t or both.


Then the expected number of events occurring in the interval [t, t + δt] is equal to
h(z, t)δt + o(δt), so in the deterministic analogue, where z = z(t),

z(t + δt) = z(t) + h(z, t)δt + o(δt),

and, letting δt → 0, we obtain

\[ \frac{dz(t)}{dt} = h(z, t). \qquad (7.1) \]


The solution to this differential equation is called the deterministic
approximation to the stochastic process {X(t); t ≥ 0}. It can be shown that
if h(z, t) is a linear function of z, that is if h(z, t) = h0(t) + z h1(t) for some
functions h0(t) and h1(t), then the deterministic approximation is equal to the
mean of the stochastic process.

For the Poisson process, h(z, t) = λ is linear in z (h1(t) = 0) and the deterministic
solution, z(t) = λt, is equal to the stochastic mean. For the simple birth process,
h(z, t) = βz, and for the cycling example, h(z, t) = 24/(2 + t). In both these cases
h(z, t) is a linear function of z and so the deterministic approximation is equal to
the mean of the stochastic process.

If h(z, t) is not of the form h0(t) + z h1(t), then the deterministic approximation
still gives a useful indication of trend, but it is not equal to the stochastic mean.
In these cases, however, the full stochastic solution is often difficult to derive and
this is why the deterministic approach is useful.
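When Equation (7.1) cannot be solved by hand, the deterministic approximation can still be computed numerically. The sketch below is one simple possibility, not a method given in this unit: it steps the equation forward with Euler's method, which is just the relation z(t + δt) = z(t) + h(z, t)δt with the o(δt) term dropped. The function name deterministic_path, the step count and the rate λ = 0.5 are assumptions for illustration.

```python
import math


def deterministic_path(h, z0, t_end, steps=10_000):
    """Approximate z(t_end) for dz/dt = h(z, t), z(0) = z0, by Euler's method."""
    dt = t_end / steps
    z, t = z0, 0.0
    for _ in range(steps):
        z += h(z, t) * dt  # z(t + dt) = z(t) + h(z, t) dt
        t += dt
    return z


lam, beta = 0.5, 1.0  # illustrative parameter values

# Poisson process: h(z, t) = lambda, so z(t) = lambda * t.
poisson_z = deterministic_path(lambda z, t: lam, z0=0.0, t_end=4.0)

# Simple birth process: h(z, t) = beta * z, so z(t) = e^(beta * t) from z(0) = 1.
birth_z = deterministic_path(lambda z, t: beta * z, z0=1.0, t_end=3.0)

print(poisson_z, lam * 4.0)
print(birth_z, math.exp(3.0))
```

The same function accepts any h(z, t), including non-linear forms for which no simple formula for the mean is available; a smaller step gives a closer approximation.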


Deterministic  models  are  next  considered  in  Unit  7  of  this  course. 


The  proof  of  this  result  is  beyond 
the  scope  of  this  course. 


Examples  of  this  occur  in  Units  10 
and  11. 




Objectives 


After  studying  this  unit  you  should  be  able  to 

understand  and  use  various  concepts  associated  with  discrete  and  continuous 
bivariate  distributions; 

give  examples  of  the  physical  concept  of  a  random  process,  and  be  able  to  identify 
sequences  of  random  variables  associated  with  such  a  process; 

identify  the  range  of  the  time  variable  and  the  state  space  of  a  random  process 
{X(t)}  or  {Xn}; 

give  rough  sketches  of  the  way  a  random  process  might  develop  over  time; 

understand  the  properties  of  the  Bernoulli  process  and  the  Poisson  process  and 
calculate  probabilities  associated  with  these  processes; 

define the random variables denoted by X(t), X(t1, t2), Tn, Wn for a point
process and state their distributions for a Poisson process;

understand  various  extensions  of  the  Poisson  process,  including  the  multivariate, 
the  non-homogeneous  and  the  compound  Poisson  processes,  and  calculate 
probabilities  associated  with  these  processes; 

simulate  realizations  of  simple  random  processes; 

calculate  and  interpret  the  index  of  dispersion  of  point  processes  in  simple  cases; 

understand  the  idea  of  a  deterministic  analogue  of  a  continuous-time  random 
process. 


Appendix:  Solutions  to  questions 


Section  1 


1.1 Summing rows and columns gives the following
marginal distributions.

x         1     2
pX(x)     0.5   0.5

y         0     1     2
pY(y)     0.2   0.5   0.3

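1.2 Since

\[ p(y|0) = \frac{p(0, y)}{p_X(0)}, \]

we have

\[ p(0|0) = \frac{p(0, 0)}{p_X(0)} = \frac{0.1}{0.6} = \frac{1}{6}, \]

\[ p(1|0) = \frac{p(0, 1)}{p_X(0)} = \frac{0.4}{0.6} = \frac{2}{3}, \]

\[ p(2|0) = \frac{p(0, 2)}{p_X(0)} = \frac{0.1}{0.6} = \frac{1}{6}. \]

So the conditional probability function is as shown in the
table below.

y         0     1     2
p(y|0)    1/6   2/3   1/6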

1.3 Since

\[ p(x|1) = \frac{p(x, 1)}{p_Y(1)}, \]

we have

\[ p(0|1) = \frac{p(0, 1)}{p_Y(1)} = \frac{0.4}{0.6} = \frac{2}{3}, \]

\[ p(3|1) = \frac{p(3, 1)}{p_Y(1)} = \frac{0.2}{0.6} = \frac{1}{3}. \]

So the conditional probability function is as shown in the
table below.

x         0     3
p(x|1)    2/3   1/3

1.4 The marginal distributions of the joint probability
functions are as follows.

(i)
x         0      1
pX(x)     0.5    0.5

y         0      1
pY(y)     0.6    0.4

(ii)
x         0      1
pX(x)     0.5    0.5

y         0      1
pY(y)     0.6    0.4

(iii)
x         0      1
pX(x)     0.75   0.25

y         0      1
pY(y)     0.6    0.4

Independence 

(i) pX(0)pY(0) = 0.5 × 0.6 = 0.3 = p(0, 0), for example;
and in fact p(x, y) = pX(x)pY(y) holds for each of
the four cells. Hence X and Y are independent.

(ii) p(0, 0) = 0.2 but pX(0)pY(0) = 0.3, so X and Y are
not independent.

(iii)  X  and  Y  are  independent. 

Examples  (i)  and  (ii)  illustrate  the  fact  that,  although  for 
any  joint  distribution  the  marginal  distributions  are 
uniquely  determined,  the  converse  is  not  true:  for  any 
pair  of  marginal  distributions  there  are  many  possible 
joint  distributions. 


1.5 (i) Using the conditional distribution in
Solution 1.2,

E(Y|X = 0) = 0 × 1/6 + 1 × 2/3 + 2 × 1/6 = 1.

On average, Mary receives one letter in a week when she
does not write any letters.

(ii) Using the conditional distribution in Solution 1.3,

E(X|Y = 1) = 0 × 2/3 + 3 × 1/3 = 1.

The average number of letters Mary writes in a week
when she receives one letter is one.
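The arithmetic in Solutions 1.2 and 1.5(i) can be reproduced exactly with rational numbers. The sketch below takes the joint probabilities p(0, y) quoted in Solution 1.2, divides by the marginal probability pX(0), and then computes the conditional mean. The variable names are illustrative only.

```python
from fractions import Fraction

# Joint probabilities p(0, y) for y = 0, 1, 2, as used in Solution 1.2.
joint_row = {0: Fraction(1, 10), 1: Fraction(4, 10), 2: Fraction(1, 10)}

p_x0 = sum(joint_row.values())  # marginal probability pX(0) = 0.6

# Conditional probability function p(y|0) = p(0, y) / pX(0).
conditional = {y: p / p_x0 for y, p in joint_row.items()}

# Conditional mean E(Y | X = 0).
conditional_mean = sum(y * p for y, p in conditional.items())

print(conditional, conditional_mean)
```

Using Fraction rather than floating-point numbers keeps the results as exact fractions, matching the table in Solution 1.2.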

1.6 (i) The marginal distributions of X and Y are
given below.

x         1      2
pX(x)     0.35   0.65

y         0      1      2
pY(y)     0.25   0.35   0.4

(ii) The conditional distribution of Y given X = 1 is as
follows.

y         0     1     2
p(y|1)    1/7   4/7   2/7

Therefore

E(Y|X = 1) = 0 × 1/7 + 1 × 4/7 + 2 × 2/7 = 8/7 = 1 1/7.

The conditional distribution of X given Y = 2 is given
below.

x         1     2
p(x|2)    1/4   3/4

Therefore

E(X|Y = 2) = 1 × 1/4 + 2 × 3/4 = 7/4 = 1 3/4.

Section  2 

2.1 (i) X_n = Y_1 + Y_2 + ... + Y_n represents the number of successes
in n trials. Consequently X_n ~ B(n, p).

(ii) X_n = X_{n-1} + Y_n.

If X_{n-1} is observed to be equal to x, then X_n = x + Y_n.
So X_n can take only the value x or x + 1, and

P(X_n = x | X_{n-1} = x) = P(Y_n = 0) = 1 - p,

P(X_n = x + 1 | X_{n-1} = x) = P(Y_n = 1) = p.

2.2  The  first  six  pairs  of  digits  from  the  27th  row  are 
25  27  83  09  89  97. 

The  probability  of  a  girl  is  0.47  (and  therefore  the 
probability of a boy is 1 - 0.47 = 0.53). So we could use
the  first  47  pairs  from  the  100  pairs  00, 01, . . . ,  99  to 
represent  a  ‘girl’,  and  the  remaining  53  to  represent  a 
‘boy’: 

00, 01, . . . ,  46  :  a  girl 
47, 48, . . . ,  99  :  a  boy 

The  given  sequence  of  six  pairs  gives  the  realization: 
girl  girl  boy  girl  boy  boy 
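The rule used in Solution 2.2 is easy to express in code. In this sketch each pair of random digits is read as a number from 0 to 99, and pairs below the cutoff 47 are counted as a girl. The function name realization is a hypothetical label.

```python
def realization(pairs, p_girl=0.47):
    """Map pairs of random digits (0 to 99) to 'girl' or 'boy'."""
    cutoff = round(100 * p_girl)  # pairs 00, 01, ..., 46 represent a girl
    return ["girl" if pair < cutoff else "boy" for pair in pairs]


# The first six pairs of digits quoted in Solution 2.2 (09 is entered as 9).
sequence = realization([25, 27, 83, 9, 89, 97])
print(sequence)
```

Any other assignment of 47 of the 100 pairs to ‘girl’ would serve equally well; taking the first 47 is merely the most convenient.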

2.3 (i) For x ≥ 1,

P(Tk = x) = P(x - 1 failures followed by a success)
          = q^(x-1) p.

So Tk has a geometric distribution starting at 1 with
parameter p, and range {1, 2, ...}; that is, Tk ~ G1(p).

(ii) The variates {Tk} are independently and identically
distributed as G1(p). The distributions do not depend on
k or on previous observations.




2.4 (i) If the customer has i different cards after n - 1
purchases, the probability that he collects a new one at
the nth purchase is (20 - i)/20. Hence Xn has a
Bernoulli distribution with p = 1 - i/20.

(ii)  No,  this  is  not  a  Bernoulli  process  because  the 
probability  of  ‘success’  (receiving  a  new  card)  changes 
during  the  process  as  more  new  cards  are  acquired. 

(iii)  Three  possible  sequences  are  given  below:  there  are 
many  others. 

{Yn; n = 1, 2, ...}, where Yn denotes the number of
different cards collected by the nth purchase. Both time
and Yn are discrete. The state space of {Yn} is
{1, 2, ..., 20}.

{Tk; k = 1, 2, ..., 20}, where Tk is the number of
purchases after the customer has k - 1 different cards
until and including the purchase at which he receives the
kth different card. Both time and Tk are discrete. The
state space of {Tk} is {1, 2, ...}.

{Wk; k = 1, 2, ..., 20}, where Wk is the total number of
purchases required for the customer to obtain k different
cards. Both time and Wk are discrete. The state space of
{Wk} is {k, k + 1, ...}.

2.5  (i)  It  is  not  a  Bernoulli  process  because  trials  on 
successive  days  are  not  independent. 

(ii)  The  state  space  of  {Xn}  is  {0, 1}. 

2.6  (i)  Examples  of  suitable  random  processes  are  as 
follows. 

{X(t); t ≥ 0}, where X(t) is the size of the colony at time
t. X(t) is discrete, t is continuous; so the state space is
discrete and the time domain is continuous. The state
space is {0, 1, 2, ...}.

{Tk; k = 1, 2, ...}, where the Tk are the times between
successive events, either divisions or deaths. The state
space {t : t > 0} is continuous and the time domain is
discrete.

{Yk; k = 1, 2, ...}, where the Yk are the times between
successive divisions. The state space {y : y > 0} is
continuous and the time domain is discrete.

{Zk; k = 1, 2, ...}, where the Zk are the times between
successive deaths. The state space {z : z > 0} is
continuous and the time domain is discrete.

Other possible processes include {Wk; k = 1, 2, ...},
where Wk is the time to the kth event, and
{Dk; k = 1, 2, ...}, where Dk is the time to the kth death.

(ii)  Assuming  that  only  one  event  (i.e.  one  division  or 
one  death)  occurs  at  any  given  time,  your  sketch  should 
show  steps  of  one  unit  up  or  down  at  random  times.  If 
the  size  X(t)  reaches  0  it  remains  there,  as  there  are  then 
no  bacteria  to  divide. 


Section  3 

3.1 The average rate of occurrence of impulses is
λ = 458 per second and the duration of the interval of
interest is t = 0.01 seconds. The number X of impulses to
occur has a Poisson distribution with mean

λt = (458 per second) × (0.01 seconds) = 4.58.

The probability that X does not exceed 1 is given by

\[ P(X \le 1) = P(X = 0 \text{ or } 1) = e^{-4.58} + \frac{e^{-4.58} \times 4.58}{1!} \approx 0.0572. \]


3.2 The eruption rate is λ = 1 per 29 months = 1/29
per month.

(i) The number of eruptions during the five-year period
specified has a Poisson distribution with mean

λt = (1/29 per month) × (60 months) ≈ 2.069.

(ii) The probability of exactly two eruptions is given by

\[ \frac{e^{-\lambda t}(\lambda t)^2}{2!} = \frac{e^{-2.069}(2.069)^2}{2!} \approx 0.2704. \]

(iii) By the memoryless property of the exponential
distribution, the time between the start of observation
and any previous event can be ignored. If T is the time
from the start of observation until the next earthquake,
then

P(T > t) = e^(-λt).

In this case, t = 3 years = 36 months, so

P(T > 36) = e^(-36/29) ≈ 0.2890.


3.3 As Wn = T1 + T2 + ... + Tn is the sum of n
independent exponential random variables T1, T2, ..., Tn,
each having parameter λ, it has a gamma distribution
Γ(n, λ). [See Unit 1, Subsection 5.3.]


3.4 Using the fifth row of exponential random numbers,
the simulation goes like this.

n    en        tn = 652 en    wn = t1 + ... + tn
1    0.2145     139.9          139.9
2    2.5019    1631.2         1771.1
3    1.3019     848.8         2619.9
4    1.6369    1067.3         3687.2

The first three failures occur at times

w1 = 139.9, w2 = 1771.1, w3 = 2619.9

(in seconds); that is, 140 seconds, 1771 seconds and
2620 seconds (to the nearest second). The fourth failure
occurs at time 3687.2, after the first hour of usage.
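The table in Solution 3.4 can be reproduced with a short calculation. In the sketch below the exponential random numbers and the scale factor 652 are those shown in the table; the variable names are illustrative only.

```python
from itertools import accumulate

exp_numbers = [0.2145, 2.5019, 1.3019, 1.6369]  # e_n, from the table above
scale = 652                                     # factor in t_n = 652 e_n

gaps = [scale * e for e in exp_numbers]         # times between failures, t_n
failure_times = list(accumulate(gaps))          # w_n = t_1 + ... + t_n

# Failures occurring within the first hour (3600 seconds) of usage.
first_hour = [w for w in failure_times if w <= 3600]

print([round(t, 1) for t in gaps])
print([round(w, 1) for w in failure_times])
```

The running totals give the failure times, and filtering at 3600 seconds confirms that only the first three fall within the first hour.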



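3.5 Setting x = 2 in the differential equation (3.4) gives

\[ \frac{dp_2(t)}{dt} = \lambda p_1(t) - \lambda p_2(t). \]

Substituting the known value of p1(t) gives

\[ \frac{dp_2(t)}{dt} = \lambda^2 t e^{-\lambda t} - \lambda p_2(t), \]

or

\[ \frac{dp_2(t)}{dt} + \lambda p_2(t) = \lambda^2 t e^{-\lambda t}. \]

As before, the method is to multiply both sides of this
equation by the integrating factor, which is again e^(λt):

\[ e^{\lambda t}\frac{dp_2(t)}{dt} + \lambda e^{\lambda t} p_2(t) = \lambda^2 t. \]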




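The left-hand side is the derivative of the product
e^(λt) × p2(t), so we can write the differential equation as

\[ \frac{d}{dt}\left(e^{\lambda t} p_2(t)\right) = \lambda^2 t. \]

Integrating gives

\[ e^{\lambda t} p_2(t) = \int \lambda^2 t\,dt = \tfrac{1}{2}\lambda^2 t^2 + c. \]

At time t = 0, we know that no events have occurred, so
p2(0) = P(X(0) = 2) = 0; and hence c = 0. This leaves

\[ e^{\lambda t} p_2(t) = \tfrac{1}{2}\lambda^2 t^2, \]

and so

\[ p_2(t) = \tfrac{1}{2}\lambda^2 t^2 e^{-\lambda t}; \]

or, rewriting,

\[ p_2(t) = \frac{e^{-\lambda t}(\lambda t)^2}{2!}. \]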


3.6 We are given λ = 10 per minute, pA = 0.6,
pB = 0.3 and pC = 0.1; so

λA = λ pA = 6 per minute,

λB = λ pB = 3 per minute,

λC = λ pC = 1 per minute.

(i) The number of customers X to arrive in a time
interval of duration t = 30 seconds = 0.5 minutes has a
Poisson distribution with mean

λt = (10 per minute) × (0.5 minutes) = 5.

Therefore

\[ P(X > 5) = 1 - P(X \le 5) = 1 - e^{-5}\left(1 + 5 + \frac{5^2}{2!} + \frac{5^3}{3!} + \frac{5^4}{4!} + \frac{5^5}{5!}\right) \approx 0.3840. \]

(ii) The number Y of customers of type A to arrive in
1 minute has a Poisson distribution with mean

λA t = (6 per minute) × (1 minute) = 6.

Therefore

\[ P(Y = 6) = \frac{e^{-6} 6^6}{6!} \approx 0.1606. \]

(Or use the appropriate table on page 15 of Neave.)

(iii) The probability requested is given by the product

\[ \frac{e^{-6} 6^6}{6!} \times \frac{e^{-3} 3^3}{3!} \times \left(1 - e^{-1}\right) \approx 0.1606 \times 0.2240 \times 0.6321 \approx 0.0227. \]
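The three probabilities in Solution 3.6 can be checked with a small helper for the Poisson probability function. The name poisson_pmf is a hypothetical label; the means 5, 6, 3 and 1 are those used in the solution.

```python
import math


def poisson_pmf(k, mean):
    """P(X = k) when X has a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


# (i) P(X > 5) = 1 - P(X <= 5), where X ~ Poisson(5).
p_more_than_5 = 1 - sum(poisson_pmf(k, 5) for k in range(6))

# (ii) P(Y = 6), where Y ~ Poisson(6).
p_six_type_a = poisson_pmf(6, 6)

# (iii) Product of the three factors shown in part (iii).
p_product = poisson_pmf(6, 6) * poisson_pmf(3, 3) * (1 - poisson_pmf(0, 1))

print(round(p_more_than_5, 4), round(p_six_type_a, 4), round(p_product, 4))
```

Multiplying the three factors in part (iii) is justified because the arrival processes for the three types of customer are independent Poisson processes.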


3.7 Write ‘students’ as type 1, ‘family’ as type 2, and
‘friends’ as type 3. Then

λ1 = 1 per 90 minutes = 2/3 per hour;

λ2 = 1 per 3 hours = 1/3 per hour;

λ3 = 1 per hour.

(i) The overall rate of telephone calls is given by

λ = λ1 + λ2 + λ3 = 2 per hour;

so over a period of t = 2 hours, the expected number of
calls is λt = 4. The probability that there will be no calls
is given by

e^(-4) ≈ 0.0183.

(ii) The proportion of calls from students is given by

\[ p_1 = \frac{\lambda_1}{\lambda_1 + \lambda_2 + \lambda_3} = \frac{2/3}{2} = \frac{1}{3}, \]

so the probability that the first call after 11 pm is from a
student is 1/3.


Section  4 


4.1 (i) The expected number of accidents in the first t
days is given by:

\[ \mu(t) = \int_0^t \lambda(u)\,du = \int_0^t \frac{24}{2 + u}\,du = \left[24\log(2 + u)\right]_0^t = 24\log(2 + t) - 24\log 2 = 24\log\frac{2 + t}{2} = 24\log\left(1 + \frac{t}{2}\right). \]

(ii) The expected number of accidents during the first
week (t = 7 days) is

μ(7) = 24 log 4.5 ≈ 36.1.

Remember that ‘log’ refers to logarithms base e and that
on most calculators you need to use the key labelled ‘ln’.

(iii) The expected number of accidents during the third
week (t1 = 14 days, t2 = 21 days) is

μ(14, 21) = μ(21) - μ(14) = 24 log 11.5 - 24 log 8
≈ 8.71.

(iv) The expected number of accidents during the fourth
week is

μ(21, 28) = μ(28) - μ(21) = 24 log 15 - 24 log 11.5
≈ 6.38.

The number of accidents follows a Poisson distribution
with this mean, so the probability that the fourth week is
free of accidents is

e^(-6.38) ≈ 0.0017.


4.2 (i) The expected number of events by time t is
given by

\[ \mu(t) = \int_0^t \lambda(u)\,du = \int_0^t 2u\,du = \left[u^2\right]_0^t = t^2. \]

(ii) The number of events by time t has a Poisson
distribution with parameter μ(t) = t^2. Therefore the
probability that there are no events in (0, t] is given by

\[ e^{-\mu(t)} = e^{-t^2}. \]

Since [X(t) = 0] and [T1 > t] are equivalent events, we
have

\[ P(T_1 > t) = e^{-t^2}, \quad t \ge 0. \]

(iii) The easiest way to find the expected value of T1 is
to use Formula (4.1) from Unit 1:

\[ E(T_1) = \int_0^\infty (1 - F(t))\,dt = \int_0^\infty P(T_1 > t)\,dt = \int_0^\infty e^{-t^2}\,dt. \]

From the standard result

\[ \int_{-\infty}^{\infty} e^{-\alpha w^2}\,dw = \sqrt{\frac{\pi}{\alpha}}, \]

it follows (setting α = 1) that

\[ \int_{-\infty}^{\infty} e^{-w^2}\,dw = \sqrt{\pi}. \]

So, since e^(-w^2) is symmetric,

\[ E(T_1) = \int_0^\infty e^{-w^2}\,dw = \tfrac{1}{2}\sqrt{\pi}. \]




(iv) Since T > t if and only if there are no events in
(w, t], it follows that

P(T > t) = P(X(w, t) = 0).

But X(w, t) ~ Poisson(μ(w, t)), where

μ(w, t) = μ(t) - μ(w) = t^2 - w^2

(from part (i)), so

\[ P(T > t) = e^{-(t^2 - w^2)}, \quad t \ge w. \]

4.3 From Solution 4.2(i),

μ(t) = t^2, t ≥ 0.

Using Formula (4.3), t1 is the solution of

t1^2 = -log(1 - u1) = -log(1 - 0.622) ≈ 0.97286,

and so

t1 ≈ 0.986.

Using Formula (4.4) gives

t2^2 = t1^2 - log(1 - u2),

which leads to

t2 ≈ 1.116.

Using Formula (4.4) again gives

t3^2 = t2^2 - log(1 - u3),

which leads to

t3 ≈ 1.655.

4.4 From Solution 4.2(i), Formula (4.4) reduces to

\[ t_{j+1}^2 = t_j^2 - \log(1 - u), \]

so that

\[ t_{j+1} = \sqrt{t_j^2 - \log(1 - u)}. \]

Writing t0 = 0, then

\[ t_1 = \sqrt{-\log(1 - 0.927)} = \sqrt{2.6173} = 1.618, \]

\[ t_2 = \sqrt{2.6173 - \log(1 - 0.098)} = \sqrt{2.7204} = 1.649, \]

\[ t_3 = \sqrt{2.7204 - \log(1 - 0.397)} = \sqrt{3.2263} = 1.796, \]

\[ t_4 = \sqrt{3.2263 - \log(1 - 0.604)} = \sqrt{4.1526} = 2.038. \]

(Full calculator accuracy has been retained throughout
these calculations.)
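The recursion used in Solution 4.4 can be sketched as a short function. It applies to the process with μ(t) = t^2 only, and keeps a running value of t_j^2 so that full accuracy is retained from step to step. The function name event_times is hypothetical; the uniform random numbers are those used in the solution.

```python
import math


def event_times(uniforms):
    """Simulate event times when mu(t) = t^2, one uniform number per event."""
    times = []
    mu = 0.0  # running value of t_j squared, starting from t_0 = 0
    for u in uniforms:
        mu -= math.log(1 - u)        # t_{j+1}^2 = t_j^2 - log(1 - u)
        times.append(math.sqrt(mu))  # t_{j+1}
    return times


times = event_times([0.927, 0.098, 0.397, 0.604])
print([round(t, 3) for t in times])
```

Notice that the simulated gaps between events shrink on average as time goes on, reflecting the increasing rate λ(t) = 2t.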

4.5 (i) The expected value of Y is

μ = E(Y) = Σ y P(Y = y)
  = 0 × 0.2 + 1 × 0.3 + 2 × 0.4 + 3 × 0.1
  = 0 + 0.3 + 0.8 + 0.3 = 1.4.

The expected value of Y^2 is

E(Y^2) = 0^2 × 0.2 + 1^2 × 0.3 + 2^2 × 0.4 + 3^2 × 0.1
       = 0 + 0.3 + 1.6 + 0.9 = 2.8.

So the variance of Y is

σ^2 = V(Y) = E(Y^2) - (E(Y))^2
    = 2.8 - 1.4^2 = 0.84.

(ii) The departure rate of shoppers is 1 per 5 minutes or
12 per hour; the interval of observation is t = 3 hours.

So the mean and variance of the total number of
purchases are given by

E[S(3)] = 1.4 × 12 × 3 = 50.4,

V[S(3)] = 12 × 3 × (0.84 + 1.4^2) = 100.8.
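The calculation in Solution 4.5 follows a pattern that is easily coded: find the mean and variance of Y from its probability function, then apply E[S(t)] = μλt and V[S(t)] = λt(σ^2 + μ^2). The variable names below are illustrative only.

```python
# Probability function of Y, the number of purchases per shopper.
p_y = {0: 0.2, 1: 0.3, 2: 0.4, 3: 0.1}

mu = sum(y * p for y, p in p_y.items())               # E(Y)
second_moment = sum(y**2 * p for y, p in p_y.items())  # E(Y^2)
sigma2 = second_moment - mu**2                        # V(Y)

rate, t = 12, 3  # 12 shoppers per hour, observed for 3 hours

mean_total = mu * rate * t                # E[S(t)] = mu * lambda * t
var_total = rate * t * (sigma2 + mu**2)   # V[S(t)] = lambda * t * (sigma^2 + mu^2)

print(round(mu, 2), round(sigma2, 2), round(mean_total, 1), round(var_total, 1))
```

Since σ^2 + μ^2 = E(Y^2), the variance could equally be computed as λt E(Y^2), which gives 36 × 2.8 = 100.8 directly.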


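4.6 If Yi ~ G0(α), then

\[ \mu = E(Y_i) = \frac{\alpha}{1 - \alpha}, \qquad \sigma^2 = V(Y_i) = \frac{\alpha}{(1 - \alpha)^2}. \]

Therefore

\[ E[S(t)] = \mu\lambda t = \frac{\alpha\lambda t}{1 - \alpha}, \]

\[ V[S(t)] = \lambda t(\sigma^2 + \mu^2) = \lambda t\left(\frac{\alpha}{(1 - \alpha)^2} + \frac{\alpha^2}{(1 - \alpha)^2}\right) = \frac{\alpha(1 + \alpha)\lambda t}{(1 - \alpha)^2}. \]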

4.7  (i)  Let  the  random  variable  Yi  take  the  value  1  if 

the  event  is  recorded,  and  0  if  it  is  not.  That  is, 

P(Yi  =  l)  =  p,  P(Yi  =  0)  =  1  -  p, 
and  Yi  ~  B(l,p).  Therefore, 

H  =  E(Yt)  =  p,  a2  =  V(Yi)  =pq. 

Hence 

■E[5(f)]  =  p\t  =  p\t] 

V[5(t)l  =  At  (cr2  +  p2)  =  A  t(pq  +  p2)  =  Xtp(q  +  p)  =  pXt. 

(ii) So $S(t)$ has equal mean and variance. So far we do not have the necessary methods for deducing the probability distribution of $S(t)$, but this equality suggests the possibility that $S(t) \sim \text{Poisson}(p\lambda t)$. In fact, this is so. It follows from a consideration of Postulate I for the Poisson process. If the probability that exactly one event occurs in the time interval $[t, t + \delta t]$ is equal to $\lambda\,\delta t + o(\delta t)$, and the probability that it is detected is $p$, then the probability that an event is detected in the time interval $[t, t + \delta t]$ is $p\lambda\,\delta t + o(\delta t)$. Hence events are detected according to a Poisson process with rate $p\lambda$, and therefore the number of events detected in an interval of length $t$ has a Poisson distribution with parameter $p\lambda t$.
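The claim that the detected events are Poisson with parameter $p\lambda t$ can be checked numerically without simulation. The sketch below conditions on the total number of events $N$ in the interval (Poisson with mean $\lambda t$), treats each event as detected independently with probability $p$, and compares the result with the Poisson($p\lambda t$) probabilities. The function names and the values of $\lambda$, $p$ and $t$ are illustrative assumptions, and the infinite sum is truncated at a hypothetical cut-off `n_max`.

```python
import math


def poisson_pmf(k, mean):
    """P(N = k) for N ~ Poisson(mean), computed on the log scale."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def binomial_pmf(k, n, p):
    """P(K = k) for K ~ B(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def detected_pmf(k, lam, p, t, n_max=100):
    """P(S(t) = k): sum over n of P(N = n) * P(k of the n events are detected)."""
    return sum(
        poisson_pmf(n, lam * t) * binomial_pmf(k, n, p)
        for n in range(k, n_max + 1)
    )


lam, p, t = 4.0, 0.3, 1.5   # illustrative values only
for k in range(6):
    print(k, detected_pmf(k, lam, p, t), poisson_pmf(k, p * lam * t))
```

For each $k$ the two printed probabilities should agree to many decimal places, the only discrepancy being the truncation of the sum.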


Section  5 


5.1 For a non-homogeneous Poisson process we know that $X(t) \sim \text{Poisson}(\mu(t))$, so

$$E(X(t)) = V(X(t)) = \mu(t).$$

Therefore

$$I(t) = \frac{V(X(t))}{E(X(t))} = \frac{\mu(t)}{\mu(t)} = 1.$$

That is, the index of dispersion for a non-homogeneous Poisson process is the same as that for a Poisson process.


5.2 In this case, using Formula (5.1),

$$I(t) = \frac{\sigma^2}{\mu} + \mu = \frac{0.84}{1.4} + 1.4 = 2.$$


The  index  of  dispersion  exceeds  1:  there  is  more 
dispersion  in  the  process  of  purchases  than  in  the  exit 
process  of  shoppers  (itself  a  Poisson  process).  The  extra 
dispersion  is  caused  by  the  variation  in  the  number  of 
purchases  per  shopper. 




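5.3 (i) When $Y \sim G_0(\alpha)$, we have

$$\mu = E(Y) = \frac{\alpha}{1 - \alpha}, \qquad \sigma^2 = V(Y) = \frac{\alpha}{(1 - \alpha)^2}.$$

Therefore

$$I(t) = \frac{\sigma^2}{\mu} + \mu = \frac{\alpha/(1 - \alpha)^2}{\alpha/(1 - \alpha)} + \frac{\alpha}{1 - \alpha} = \frac{1}{1 - \alpha} + \frac{\alpha}{1 - \alpha} = \frac{1 + \alpha}{1 - \alpha}.$$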


(ii) When $Y \sim \text{Poisson}(\mu)$, we have $\sigma^2 = V(Y) = \mu$, so

$$I(t) = \frac{\sigma^2}{\mu} + \mu = \frac{\mu}{\mu} + \mu = 1 + \mu.$$
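The results of Solutions 5.2 and 5.3 can be checked with a few lines of Python. This is a minimal sketch: the function names and the trial values of $\alpha$ and $\mu$ are illustrative assumptions, and the only formula used is Formula (5.1), $I(t) = \sigma^2/\mu + \mu$.

```python
def index_of_dispersion(mu, sigma2):
    """Formula (5.1) for a compound Poisson process: sigma^2 / mu + mu."""
    return sigma2 / mu + mu


def geometric_moments(alpha):
    """Mean and variance of Y ~ G_0(alpha), the geometric distribution on 0, 1, 2, ..."""
    return alpha / (1 - alpha), alpha / (1 - alpha) ** 2


# Solution 5.2: purchases per shopper have mean 1.4 and variance 0.84.
i_shoppers = index_of_dispersion(1.4, 0.84)

# Solution 5.3(i): compare with the closed form (1 + alpha) / (1 - alpha).
alpha = 0.25                                   # illustrative value
i_geometric = index_of_dispersion(*geometric_moments(alpha))
closed_form_geometric = (1 + alpha) / (1 - alpha)

# Solution 5.3(ii): for Y ~ Poisson(mu) the variance equals the mean.
mu = 3.0                                       # illustrative value
i_poisson = index_of_dispersion(mu, mu)

print(i_shoppers, i_geometric, closed_form_geometric, i_poisson)
```

In each case the index exceeds 1, reflecting the extra dispersion contributed by the variation in $Y$.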


Section  6 

6.1 (i) When there is one individual alive, the waiting time $T_1$ to the next birth is exponential with mean $1/\beta$: $T_1 \sim M(\beta)$. When there are two individuals alive, the waiting time $T_2$ to the next birth is exponential with mean $1/(2\beta)$: $T_2 \sim M(2\beta)$.

(ii) Since $T_n \sim M(n\beta)$, the time between the $(n - 1)$th and $n$th births can be simulated by using a random number from $M(1)$ and dividing it by $n\beta$.


n    Random number    nβ    Simulated        Time of
     from M(1)              time interval    nth birth

1    1.1118           1     1.1118           1.1118
2    1.9728           2     0.9864           2.0982
3    0.6191           3     0.2064           2.3046
4    0.0149           4     0.0037           2.3083
5    0.5376           5     0.1075           2.4158
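The table can be reproduced with a short Python sketch using the same five random numbers from $M(1)$. The function name `birth_times` is a hypothetical helper, and $\beta = 1$ is an assumption inferred from the $n\beta$ column, whose entries equal $n$.

```python
def birth_times(exp1_samples, beta):
    """Turn M(1) random numbers into birth times for a simple birth process.

    The n-th interval is the n-th sample divided by n * beta, because the
    waiting time with n individuals alive is M(n * beta).
    """
    rows = []
    time = 0.0
    for n, u in enumerate(exp1_samples, start=1):
        interval = u / (n * beta)      # simulated time between births
        time += interval               # time of the n-th birth
        rows.append((n, u, n * beta, interval, time))
    return rows


random_numbers = [1.1118, 1.9728, 0.6191, 0.0149, 0.5376]
beta = 1.0   # assumed value, matching the table's n*beta column

rows = birth_times(random_numbers, beta)
for n, u, rate, interval, time in rows:
    print(f"{n}  {u:.4f}  {rate:g}  {interval:.4f}  {time:.4f}")
```

Because the sketch accumulates unrounded intervals while the table adds intervals already rounded to four decimal places, the final digit of a birth time may occasionally differ.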

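6.2 (i) Since $T_i \sim M(i\beta)$, then

$$E(T_i) = \frac{1}{i\beta}, \qquad V(T_i) = \frac{1}{i^2\beta^2},$$

using standard results for the exponential distribution.

(ii) $W_n = T_1 + T_2 + \cdots + T_n$.

The expectation of the sum of $n$ random variables is equal to the sum of their expectations, so

$$E(W_n) = \sum_{i=1}^{n} E(T_i) = \frac{1}{\beta}\left(1 + \frac{1}{2} + \cdots + \frac{1}{n}\right).$$

The variance of the sum of independent variates is equal to the sum of their variances, so

$$V(W_n) = \sum_{i=1}^{n} V(T_i) = \frac{1}{\beta^2}\left(1 + \frac{1}{2^2} + \cdots + \frac{1}{n^2}\right).$$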

6.3 The important features to notice in these sketches are: service time is constant, so $Q(t)$ falls by 1 every unit of time; when $\lambda < 1$, so that the mean inter-arrival time is greater than 1, the queue length will tend to decrease and may well be zero for some time; when $\lambda > 1$, the queue length will tend to increase.


6.4 Possible sequences are:

$\{X(t);\ t \ge 0\}$, where $X(t)$ is the number of bulbs that have failed by time $t$;

$\{T_k;\ k = 1, 2, \ldots\}$, where $T_k$ is the lifetime of the $k$th bulb;

$\{A_k;\ k = 1, 2, \ldots\}$, where $A_k$ is the mean lifetime of the first $k$ bulbs.




Section  7 


7.1 (i) The first postulate for the simple birth process states that, for a population of size $x$ at time $t$, the probability of one birth during the time interval $[t, t + \delta t]$ is equal to $\beta x\,\delta t + o(\delta t)$. If we suppose the number alive at time $t$ to be $z(t)$, then the expected number of births in $[t, t + \delta t]$ is $\beta z(t)\,\delta t + o(\delta t)$; so

$$z(t + \delta t) = z(t) + \beta z(t)\,\delta t + o(\delta t).$$

Hence

$$\frac{z(t + \delta t) - z(t)}{\delta t} = \beta z(t) + \frac{o(\delta t)}{\delta t}.$$

Letting $\delta t \to 0$ gives

$$\frac{dz(t)}{dt} = \beta z(t).$$


(ii) This differential equation can be solved by guesswork. (What function differentiates to $\beta$ times itself? Answer: $Ae^{\beta t}$ for any constant $A$.) Or it can be solved more formally by the method of separation of variables:

$$\int \frac{dz}{z} = \int \beta\,dt$$

gives

$$\log z(t) = \beta t + c,$$

where $c$ is a constant. Taking exponentials, the left-hand side becomes

$$e^{\log z(t)} = z(t)$$

and the right-hand side becomes

$$e^{\beta t + c} = e^{c}e^{\beta t} = Ae^{\beta t},$$

where $A$ is a constant. We know therefore that the general solution is

$$z(t) = Ae^{\beta t}.$$

We are told that $z(t) = 1$ at $t = 0$; this gives $A = 1$, and so the final deterministic solution for the simple birth process is

$$z(t) = e^{\beta t}.$$
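As a numerical check on the deterministic solution, the difference equation from part (i) can be stepped forward with the $o(\delta t)$ term dropped and compared with $e^{\beta t}$. This is a minimal sketch of Euler's method, which is not the method used in the unit; the function name, the value of $\beta$, the end time and the number of steps are all illustrative assumptions.

```python
import math


def euler_birth(beta, t_end, steps):
    """Step z(t + dt) = z(t) + beta * z(t) * dt from z(0) = 1 up to t_end."""
    dt = t_end / steps
    z = 1.0                      # initial condition z(0) = 1
    for _ in range(steps):
        z += beta * z * dt       # one small time step, o(dt) term ignored
    return z


beta, t_end = 0.5, 2.0           # illustrative values only
approx = euler_birth(beta, t_end, steps=100_000)
exact = math.exp(beta * t_end)   # deterministic solution z(t) = e^(beta * t)
print(approx, exact)
```

As the number of steps increases, the stepped value approaches the exact solution, mirroring the limit $\delta t \to 0$ taken in part (i).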




